ABSTRACT


This study examined two sites from a Landsat scene of
portions of Honduras and Nicaragua. One site was examined
for potential water obstacles, and the other was examined
for cover and concealment provided by vegetation. The
results suggest that potential water obstacles can be
detected. It is not clear whether vegetative cover and
concealment can be reliably detected. A study using better
ground reference information than was available is necessary
to answer that question. Several unsupervised classification
algorithms were used and compared. A histogram clustering
algorithm followed by a minimum distance classifier provided
results comparable to the much slower K-means and
isodata-type algorithms. Several methods to reduce the
dimensionality of the classification problem were examined,
including band subsets, between-band ratios, the principal
component transformation, and the tasseled cap
transformations. Band subsets provided adequate accuracy and
are the easiest method to implement.


I


INTRODUCTION 


A. MOTIVATION 

The Armed Forces of the United States are responsible 
for being prepared to conduct military operations anywhere 
in the world on short notice. Accurate maps are essential 
to successful military operations, so a great deal of effort 
goes into creating, maintaining, and updating maps. In 
fact, an entire Department of Defense agency, the Defense 
Mapping Agency, exists for this purpose. 

However, no matter how good a map was when it was created,
it is a static entity. Once printed, it is difficult to
update a map to reflect changed ground conditions. Also,
seasonal variations (which can have significant effects on
military operations) cannot always be completely included in
a map.

For military operations, accurate information on current
ground conditions is needed, since ground conditions
significantly affect the ability of friendly and enemy forces
to move, shoot, and communicate. The formal process of
analyzing current ground conditions for their effects on
military operations is called terrain analysis [Ref. 1:pp. 33-40].

In addition to maps, terrain analysis inputs include
information from other sources: interrogation; ground and
aerial surveillance and reconnaissance; imagery
interpretation; target acquisition and night observation
devices; maps and charts; and studies on transportation,
trafficability, cross-country movement, climate, and
hydrography [Ref. 2:p. 2-2]. Many of these methods and
sources rely on long-term information gathering (e.g.,
climate, hydrography, maps, and charts). The short-term
information gathering methods (e.g., ground or aerial
reconnaissance) can give the enemy information about planned
operations simply because the reconnaissance effort is
detected. Also, most intelligence collection methods are not
well suited to rapid, large-scale data collection on current
terrain conditions.

"Satellites can cover far more territory than an aircraft,
and, of equal importance, they can photograph it all
in the same day so that intelligence staffs can see the
overall picture." [Ref. 3:p. 230] Though the quote in its
original context referred more to special photographic
reconnaissance satellites, it identifies the same underlying
problem: how to rapidly gather the most current information
on terrain conditions.

Landsat thematic mapper (TM) imagery is used for a wide
variety of applications, including crop yield estimation,
forest inventory, urban land use mapping, and a variety of
other land use/land cover and resource management
applications [Ref. 4:p. 1-2]. Since some of these
applications are related to items of interest in military
terrain analysis, Landsat TM imagery might be usable for part
of military terrain analysis. It has been found that

Landsat analyses usually cost less than 1 percent of the 
cost of comparable aerial surveys. They are therefore 
particularly useful for mapping inaccessible terrain. 

[Ref. 4:p. 1-2] 

Since this is exactly the type of problem identified here, 
Landsat TM imagery may be useful in solving at least part of 
that problem. 

Another advantage of Landsat imagery is that, because of
its mission of earth resource observation, Landsat's orbit
is both regular and periodic. Therefore, it covers the
earth in a predictable pattern, so it gives no evidence of
any particular interest in a given area. Because its orbital
inclination is 96.22°, Landsat covers the entire globe
between about 84° North and 84° South every 16 days [Ref.
4:p. 2-3].

B. OBJECTIVES 

Using Landsat TM imagery, it may be possible to develop a
computer-assisted determination of the terrain analysis
factors of concealment and cover and certain types of
obstacles with an acceptable degree of accuracy. Not only
would this save time in performing terrain analysis and
provide more information than a paper map alone; it would
also provide much more recent information about the area of
interest.

The primary research question examined in this study is:
can unsupervised pattern recognition algorithms be
effectively used on Landsat thematic mapper imagery to perform
parts of the terrain analysis step of the Intelligence
Preparation of the Battlefield process? Specifically, the
study will focus on the identification of obstacles and
cover and concealment. Because of the 30 x 30 meter pixel
size, relatively small obstacles will probably not be
detectable, but obstacles with distinctive spectral
characteristics or distinctive effects on the surrounding
terrain (like streams and swamps) may be detectable. Heavy
vegetation, especially woods and forests, is particularly good
for providing concealment and some cover. Since forest
inventory is already one of the uses of Landsat TM imagery
[Ref. 4:p. 1-2], Landsat imagery may be useful in
identifying forests with good military terrain properties.

The Landsat TM imagery was analyzed primarily by using
the Land Analysis System (LAS), a software package optimized
for earth resource evaluation of Landsat imagery. Some
additional short computer programs were required where LAS
routines either did not perform exactly the required
function or were not convenient to use. The output of the
above analyses was then used to produce a set of map overlays
of the various terrain analysis factors. Producing the map
overlays was the ultimate goal of this study.

C. LIMITATIONS


Due to the location and date of the Landsat TM imagery
(Honduras and Nicaragua, 24 March 1986) available for this
study, gathering ground reference information was
impossible. Though 1:50,000 scale maps of the area were used
for the ground reference, they may not be, in all cases, an
adequate replacement for the on-the-ground verification of
factors that are important in this study (e.g., density of a
forest vs. the forest's spectral response pattern,
identification of irrigated cropland and crop identification).

The pixel size in Landsat TM imagery is 30 x 30 meters.
Many items of tactical interest, especially obstacles, may
not be distinguishable at this resolution.

This study will only look at a subset of the items of 
interest in terrain analysis for a specific small area. The 
results will mainly be an indication of whether or not more 
research will be worthwhile on this subject rather than a 
definitive answer to the posed research question. 

D. SUMMARY OF RESULTS 

The results of this study seem to indicate that potential
water obstacles can be identified using Landsat TM
imagery. Several band sets and band combinations were
evaluated for their relative usefulness in detecting
potential water obstacles. Several unsupervised clustering
algorithms were also evaluated. The results are presented
in Chapter IV.

A determination on the possibility of detecting vegetated
areas providing cover or concealment could not be made.
The results in this area were mixed. Better ground
reference information is needed before a definite
determination can be made. These results hold for all band
combinations evaluated.

E. ORGANIZATION 

Chapter II provides a background on the concepts and 
information that form the basis for this study. Chapter III 
provides details on how the study was conducted. Chapter IV 
presents an analysis of the results achieved in the study, 
and Chapter V draws some conclusions and recommends further 
research. 
II


BACKGROUND 


A. TERRAIN ANALYSIS 

Terrain analysis is part of the U.S. Army's Intelligence
Preparation of the Battlefield (IPB) process, a formalized
situation and target development process that provides
commanders with the intelligence and targeting data needed
to plan and fight battles. [Ref. 1:pp. 33-40] Terrain
analysis focuses on the military aspects of terrain and
their effects on friendly and enemy capabilities to move,
shoot, and communicate (the basic tactical functions).
Terrain analysis includes the following five factors:
observation and fields of fire, concealment and cover,
obstacles, key terrain, and avenues of approach and mobility
corridors. While determination of key terrain and avenues
of approach is heavily dependent on a unit's size, mission,
and tactical situation, the other three factors are more
consistent.

Since weather can also have a significant effect on
terrain, and thus affect friendly and enemy capabilities,
weather analysis is also an important step in IPB [Ref. 1:p.
38].

An obstacle map overlay is created which combines all
terrain and weather induced obstacles identified in the
above analysis. [Ref. 1:p. 38] Next, avenues of approach
and mobility corridors are identified, with emphasis on
areas where the enemy can move. The most viable avenues of
approach and mobility corridors are identified and overlays
are prepared depicting each one. These overlays are then
used in the final step of IPB, threat integration. Threat
integration integrates enemy doctrine with terrain and
weather information to determine how the enemy will fight as
influenced by terrain and weather.

1. Observation and fire 

Observation is the influence of terrain on the
ability of a force to exercise surveillance over a given
area either directly or through the use of sensors. [Ref.
2:p. 2-12] Characteristics of terrain which restrict
observation include hills, cliffs, vegetation, and manmade
features.

Fire is the influence of terrain on the effectiveness
of direct and indirect fire weapons.

Indirect fire weapons such as mortars are affected
primarily by terrain conditions within the target area
which may influence the terminal effect of the projectile.
Fields of fire for direct fire weapons such as
machineguns and automatic rifles are primarily affected
by terrain conditions between the weapon and the target.
[Ref. 2:p. 2-12]

2. Concealment and cover 

Concealment provides protection from observation.
Cover provides protection from the effects of weapons fire.
[Ref. 2:p. 2-12] Concealment may be provided by terrain
features such as woods, underbrush, tall grass, and
cultivated vegetation.

Concealment from ground observation does not
necessarily provide concealment from air observation or
from electronic or infrared devices. Concealment
does not necessarily provide cover. [Ref. 2:p. 2-12]

Cover may be provided by trees, rocks, ditches,
folds in the ground, buildings, embankments and similar
features. [Ref. 2:p. 2-12] "Areas that provide cover from
direct fires may or may not protect against the effects of
indirect fire; however, most terrain features that afford
cover also afford concealment." [Ref. 2:p. 2-13]

3. Obstacles

"An obstacle is any natural or artificial terrain
feature which stops, impedes, or diverts military movement."
[Ref. 2:p. B-3] Natural obstacles include rivers, streams,
lakes, swamps, steep slopes, dense woods, deserts,
mountains, cities, and certain types of unstable soil.
Artificial obstacles are works of construction and destruction
executed to stop, impede, or divert military movement. They
include minefields, craters, antitank ditches, trenches,
roadblocks, deliberately flooded areas, extensive rubble,
and forest fires.

4. Key Terrain 

A key terrain feature is any location or area whose
seizure or control affords a marked advantage to either
opposing force. [Ref. 2:p. 2-13] Types of terrain features
which are frequently selected as key terrain for tactical
units include high ground that provides favorable
observation and fire over a significant portion of the
operation area and bridges over unfordable rivers.

5. Avenues of Approach and Mobility Corridors 

An avenue of approach is a route for a force of a
particular size to reach an objective. [Ref. 2:p. 2-14]
The analysis of an avenue of approach at any level of
command is based on the consideration of observation and fire,
cover and concealment, obstacles, utilization of key
terrain, adequate maneuver space, and ease of movement.

A mobility corridor is that part of an avenue of
approach that allows a particular-sized unit to deploy in
its doctrinal tactical formation [Ref. 1:p. 38].

This study focused on the military terrain
classifications of water obstacles and cover and concealment as
provided by vegetation.

B. LANDSAT 

1. General Information 

The Landsat series of satellites began with the
Earth Resources Technology Satellite (ERTS), launched in
July 1972. [Ref. 4:p. 1-1] ERTS was renamed Landsat 1 in
1975 to reflect its primary use as a land resources
observatory.

The second generation of Landsat satellites (4 and
5) carried the thematic mapper (TM) in addition to the
multispectral scanner (MSS) of the earlier Landsat
satellites. [Ref. 4:p. 2-1] The TM improved on both the
spectral (seven bands vs. three) and spatial (30 m vs. 82 m)
resolution of the MSS.

Landsat 4, launched in July 1982, developed early 
communication and solar array problems that restricted it to 
use of the MSS only. [Ref. 4:p. 2-1] Landsat 5 was 
launched in March 1984 and is currently the only source of 
TM imagery. 

As the earth turns below the orbiting Landsat
spacecraft, the TM and MSS scan the ground directly beneath in
a fixed width swathing pattern perpendicular to the direction
of the orbit. [Ref. 4:pp. 2-2 to 2-3] Both sensors have a
185 kilometer east-to-west swathing pattern. Swaths are
designed to overlap for complete surface coverage. Landsat
5 circles the earth in a sun-synchronous, near-polar orbit
at an altitude of approximately 705 km. The ground track is
repeated in a 16-day cycle totaling 233 orbits. Since the
inclination of the orbit is 96.22°, Landsat can cover the
entire globe between about 84° North and 84° South every 16
days. These characteristics were chosen to satisfy the need
for near-constant resolution, periodic observations of the
same site, and moderately constant illumination. Landsat
5 crosses the equator on a north-to-south (daylight)
path at approximately 9:45 a.m. local time each morning.

At the equator, adjacent swaths overlap by approximately
7 percent. [Ref. 4:p. 2-3] This overlap increases
as the satellite moves toward either pole because the orbit
paths converge with increasing latitude.
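
The orbit figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes an equatorial circumference of 40,075 km (a value not given in the text) and uses hypothetical function names. It estimates the spacing between adjacent ground tracks at the equator, the resulting swath overlap, and the highest latitude reached by the ground track of a retrograde orbit with the stated inclination.

```python
# Figures quoted in the text for Landsat 5.
ORBITS_PER_CYCLE = 233     # orbits in one 16-day repeat cycle
SWATH_WIDTH_KM = 185.0     # east-to-west swath width
INCLINATION_DEG = 96.22    # orbital inclination as given in the text

# Assumption (not from the text): equatorial circumference of the earth.
EQUATOR_CIRCUMFERENCE_KM = 40075.0


def equatorial_track_spacing_km():
    """Distance between adjacent ground tracks at the equator."""
    return EQUATOR_CIRCUMFERENCE_KM / ORBITS_PER_CYCLE


def equatorial_overlap_fraction():
    """Fraction of one swath that is shared with its neighbor at the equator."""
    spacing = equatorial_track_spacing_km()
    return (SWATH_WIDTH_KM - spacing) / SWATH_WIDTH_KM


def max_ground_track_latitude_deg():
    """Highest latitude reached by the ground track of a retrograde orbit."""
    return 180.0 - INCLINATION_DEG


if __name__ == "__main__":
    print(f"track spacing: {equatorial_track_spacing_km():.1f} km")
    print(f"overlap: {100 * equatorial_overlap_fraction():.1f} percent")
    print(f"max latitude: {max_ground_track_latitude_deg():.2f} degrees")
```

Under these assumptions the track spacing is about 172 km, which gives an overlap close to the 7 percent stated above, and the maximum ground track latitude is close to the 84° limit of coverage.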

2. Thematic Mapper 

The thematic mapper (TM) is a scanning optical-mechanical
sensor operating in the visible and infrared
wavelengths. [Ref. 4:p. 3-1] It contains a scan mirror
assembly that projects reflected earth radiation onto
detectors arrayed in two focal planes. The TM uses the forward
motion of the spacecraft for along-track scan and uses a
moving mirror assembly for the cross-track (perpendicular to
the spacecraft) direction.

The seven TM spectral bands were selected for their
value in discriminating vegetation type and vigor, measuring
plant and soil moisture, differentiating clouds and snow,
and identifying hydrothermal alteration in certain types of
rock [Ref. 5:p. 32]. Table 1 [Ref. 5:p. 30] lists the
characteristics of the TM sensor and Table 2 [Ref. 4:p. 3-2,
6:p. 86] lists some of the applications of the TM spectral
bands.

TABLE 1. THEMATIC MAPPER SENSOR SYSTEM CHARACTERISTICS.
AFTER REF. 5:P. 30

Band                       Radiometric
Number    Micrometers      Sensitivity (NEΔP)

1          0.45 -  0.52    0.8
2          0.52 -  0.60    0.5
3          0.63 -  0.69    0.5
4          0.76 -  0.90    0.5
5          1.55 -  1.75    1.0
7          2.08 -  2.35    2.4
6         10.40 - 12.50    0.5 K (NEΔT)

Note: Radiometric sensitivities are the noise-equivalent
radiance differences for the reflective bands
expressed as a percentage and as a temperature
difference for the thermal infrared band.

The TM bands are numbered out of the order of the
wavelength intervals covered. [Ref. 6:p. 85] The wavelength
interval for band 7 falls between the wavelength
intervals covered by bands 5 and 6. This is because band 7
was added to the TM after the other six bands were defined,
and a decision was made not to renumber the bands.

The instantaneous field of view (IFOV) for bands 1
through 5 and band 7 (the reflective bands) is equivalent to
a 30 meter square when projected to the ground. [Ref. 4:p.
3-1] Band 6, the thermal infrared band, has an IFOV
equivalent to a 120 meter square. These data are resampled
during geometric processing to produce 28.5 meter and 120
meter IFOVs for the reflective and thermal infrared bands,
respectively.

TABLE 2. CHARACTERISTICS OF THE THEMATIC MAPPER BANDS.
AFTER REF. 4:P. 3-2, 6:P. 86

Band  Wavelength, µm   Characteristics

1      0.45 -  0.52    Blue-green. Maximum penetration of water,
                       which is useful for bathymetric mapping in
                       shallow water. Useful for distinguishing
                       soil from vegetation and deciduous from
                       coniferous plants.

2      0.52 -  0.60    Green. Matches green reflectance peak of
                       vegetation, which is useful for assessing
                       plant vigor.

3      0.63 -  0.69    Red. Matches a chlorophyll absorption band
                       that is important for discriminating
                       vegetation types.

4      0.76 -  0.90    Reflected IR. Useful for determining
                       biomass content and for water body mapping.

5      1.55 -  1.75    Reflected IR. Indicates moisture content of
                       soil and vegetation. Penetrates thin
                       clouds. Good contrast between vegetation
                       types. Useful for snow/cloud
                       differentiation.

6     10.40 - 12.50    Thermal IR. Nighttime images are useful for
                       thermal mapping and for estimating soil
                       moisture.

7      2.08 -  2.35    Reflected IR. Coincides with an absorption
                       band caused by hydroxyl ions in minerals.
                       Ratios of bands 5 and 7 are potentially
                       useful for mapping hydrothermally altered
                       rocks associated with mineral deposits.

Classification accuracy becomes acceptable for most
remote sensing agriculture and forestry applications when
field sizes are greater than 60 IFOVs. [Ref. 5:p. 33] This
corresponds to a square of about 8 IFOVs on a side.
Therefore, classification accuracy should be acceptable for
fields 240 x 240 m in size if a 30 x 30 m ground IFOV is
used. This should allow most of the field sizes in Canada,
the United States, and the Soviet Union to be adequately
sampled in the spatial domain. It should also provide some
information for fields in developing countries where field
sizes are less than 240 x 240 m.
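
The field size arithmetic above can be written out explicitly. The following is a minimal sketch with a hypothetical function name. It assumes a square field and rounds the side length up to a whole number of IFOVs, which is how 60 IFOVs becomes a square of about 8 IFOVs on a side.

```python
import math


def min_field_side_m(min_ifovs=60, ifov_side_m=30.0):
    """Side length of the smallest square field holding at least min_ifovs IFOVs."""
    # A square of n x n IFOVs holds n * n IFOVs, so n is the root rounded up.
    side_in_ifovs = math.ceil(math.sqrt(min_ifovs))
    return side_in_ifovs * ifov_side_m


if __name__ == "__main__":
    print(min_field_side_m())          # nominal 30 m IFOV
    print(min_field_side_m(60, 28.5))  # resampled 28.5 m IFOV
```

With the nominal 30 m IFOV this gives the 240 m side quoted above. With the resampled 28.5 m IFOV the same rule gives a slightly smaller field.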

There are some known limitations of the various TM
bands. At high sun angles, bands 5 and 7 saturate over
bright areas such as sandy beaches [Ref. 4:pp. 2-3 to 2-5].
The time of the Landsat overpass is too early in the day to
record maximum thermal contrast, which occurs in the early
afternoon. Preliminary studies suggest that band 6 does not
significantly enhance the accuracy of the usual land cover
analysis [Ref. 7:p. 220].

3. Radiometric and Geometric Correction of Image Data

Landsat digital image data transmitted from the
satellite have some degree of distortion because of
characteristics of the sensing and recording systems as well as
atmospheric and scene conditions. [Ref. 4:pp. 4-2 to 4-3]
Radiometric distortion is caused by blurring effects of the
sensor, transmission noise, atmospheric interference,
variable surface illumination, and changes in surface radiance
due to changes in the viewing angle. Geometric distortion
results from spacecraft effects, such as attitude and
altitude changes; earth effects caused by its curvature,
rotation, and terrain relief; and temporary aberrations in the
scanning system.

Radiometric corrections account for errors in the
image pixel radiance values caused by changing sensor
characteristics. [Ref. 4:p. 4-4] The sensors have internal
calibration lamps, which were calibrated before launch.
They are used to calibrate detector gains and biases. Data
from these lamps can be used to track overall sensor
response and identify drift away from nominal performance for
each detector. Radiometric corrections are made on a
band-by-band basis. Different algorithms for estimating gain
and bias are applied to the six reflective bands and to the
thermal band.

Remotely sensed data usually contain both systematic
and nonsystematic geometric errors. [Ref. 5:pp. 102-103]
Systematic errors can be corrected using data about the
satellite's position and orientation and knowledge of the
internal sensor distortion. Nonsystematic errors cannot be
corrected with acceptable accuracy without a sufficient
number of ground control points. A ground control point is
a point on the surface of the earth where both image
coordinates and map coordinates can be identified.

After the systematic errors have been corrected,
some slight geometric distortion remains because of
uncertainties in spacecraft position and orientation. [Ref. 4:p.
4-5] This distortion is normally acceptable, but it
can be removed through the use of ground control
points.

C. ENERGY INTERACTIONS 

When electromagnetic energy is incident on any earth
surface feature, three fundamental energy interactions are
possible. [Ref. 8:p. 11] The energy can be reflected,
transmitted, or absorbed. Using the principle of
conservation of energy, the relationship between these energy
interactions is

$$E_I(\lambda) = E_R(\lambda) + E_A(\lambda) + E_T(\lambda) \qquad (1)$$

where $E_I$ is the incident energy, $E_R$ is the reflected energy,
$E_A$ is the absorbed energy, and $E_T$ is the transmitted energy.
All energy components are a function of the wavelength, $\lambda$.

It should be noted that the proportions of energy
reflected, absorbed, and transmitted vary for different earth
surface features, depending on the specific material type
and condition. [Ref. 8:p. 12] Wavelength dependency means
that, for a given surface feature, the proportions of energy
reflected, absorbed, and transmitted will vary at different
wavelengths.

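An earth surface feature can be characterized by
measuring the fraction of the incident energy that is reflected.
[Ref. 8:p. 13] This quantity is called the spectral
reflectance, $\rho_\lambda$, and is a function of wavelength. It is
defined as the ratio of the reflected energy to the incident
energy at wavelength $\lambda$,

$$\rho_\lambda = \frac{E_R(\lambda)}{E_I(\lambda)} \times 100 \qquad (2)$$

where $\rho_\lambda$ is expressed as a percentage.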
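
The energy balance and the definition of spectral reflectance can be illustrated with a short calculation. The following is a minimal sketch; the function names and the sample energy values are hypothetical and are not measurements from this study.

```python
def absorbed_energy(incident, reflected, transmitted):
    """Absorbed energy from conservation of energy at one wavelength."""
    return incident - reflected - transmitted


def spectral_reflectance_percent(reflected, incident):
    """Reflected energy as a percentage of incident energy at one wavelength."""
    if incident <= 0:
        raise ValueError("incident energy must be positive")
    return 100.0 * reflected / incident


if __name__ == "__main__":
    # Hypothetical energies at a single wavelength, in arbitrary units.
    incident, reflected, transmitted = 100.0, 45.0, 5.0
    print(absorbed_energy(incident, reflected, transmitted))
    print(spectral_reflectance_percent(reflected, incident))
```

Repeating the reflectance calculation at a series of wavelengths gives the points of a spectral reflectance curve.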

A graph of the spectral reflectance of an earth surface
material as a function of wavelength is called a spectral
reflectance curve. [Ref. 8:p. 13] The spectral reflectance
curve can give insight into the spectral characteristics of
a surface material and has a strong influence on the choice
of wavelengths in which remote sensing data are acquired for
a particular application.

Many earth surface features can be identified on the
basis of their spectral characteristics [Ref. 8:p. 15].
Some features of interest cannot be spectrally separated.
The success of multispectral image analysis depends on two
factors: that any surface feature (e.g., a field of wheat)
will have a different radiance at one wavelength than at
another, providing that the difference between the two
wavelengths is sufficiently large; and that no two
dissimilar surface features will have the same radiance at both
wavelengths [Ref. 9:p. 363].

Figure 1 shows typical spectral reflectance curves for
three basic types of earth surface feature: healthy green
vegetation, dry bare soil, and clear lake water. [Ref. 8:p.
15] The curves represent average values for these material
types. The reflectance of individual features can vary
considerably above and below the average.

Figure 1. Typical Spectral Reflectance Curves for Vegetation,
Soil, and Water. From Ref. 8:p. 15

D. PATTERN RECOGNITION

1. General

"The signals from a given sensor can be thought of
as defining a multi-dimensional space where each sensor band
corresponds to a dimension." [Ref. 10:p. 81] The boundaries
of that space are then defined by the minimum and maximum
possible values in each of the bands. The basic pattern
recognition problem is to determine the information class or
category of each distinct region on the ground using the set
of sensor measurements and to estimate the error rate for
the class assignments [Ref. 11:p. 793].

Information classes are those defined by man [Ref.
5:p. 179]. Information classes can be land use or land
cover types which are of interest to the user of the final
classification product. Conversely, spectral classes are
those that are inherent in the remote sensor measurement
space. Spectral classes are only of interest to the extent
that they can be matched to one or more information classes
[Ref. 7:p. 308]. Often spectral classes do not match
directly to information classes because of the effect of mixed
pixels (i.e., pixels containing more than one class) and
because of spectral diversity in nominally uniform
information classes (i.e., one information class corresponds to
several spectral classes) [Ref. 7:pp. 308-309].

The sensor may not gather sufficient information to
allow discrimination between the classes of
interest. [Ref. 11:p. 795] In these cases one may be
forced to define more discernible classes even though they
may be of less interest to the user of the final product.
To help determine the discernible classes, one can employ an
unsupervised classification or clustering process, which can
identify the naturally distinguishable classes in
the sensor's data.

If the individual classes of the patterns are
already known, then one has a supervised pattern recognition
problem. [Ref. 12:pp. 1-2] In supervised pattern
recognition a portion of the set of known patterns is extracted
and used to derive a pattern classification algorithm. These
patterns are called the training set. The remaining known
patterns are then used to test the classification algorithm.
These patterns are referred to as the test set. Since the
correct class of each of the patterns in the test set is
known, one can evaluate the performance of the algorithm.
Once a desired level of performance is achieved, which is
normally measured in terms of the misclassification rate,
the algorithm can be used on initially unlabeled patterns.
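
The training set, test set, and misclassification rate described above can be sketched in a few lines. The example below is a minimal illustration only. It assumes a minimum-distance-to-means classifier as the classification algorithm, which is a choice made here for brevity and is not prescribed by the references. The function names and the two-band sample patterns are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math


def class_means(training_set):
    """Mean pattern of each class from a list of (pattern, label) pairs."""
    sums, counts = {}, {}
    for pattern, label in training_set:
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        for i, value in enumerate(pattern):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(pattern, means):
    """Assign a pattern to the class whose mean is nearest (Euclidean distance)."""
    return min(means, key=lambda label: math.dist(pattern, means[label]))


def misclassification_rate(test_set, means):
    """Fraction of test patterns whose assigned class differs from the known class."""
    errors = sum(1 for pattern, label in test_set
                 if classify(pattern, means) != label)
    return errors / len(test_set)


if __name__ == "__main__":
    # Hypothetical two-band patterns with known classes.
    training_set = [((10, 12), "water"), ((12, 10), "water"),
                    ((60, 90), "forest"), ((64, 86), "forest")]
    test_set = [((11, 11), "water"), ((58, 88), "forest"),
                ((40, 55), "water")]
    means = class_means(training_set)
    print(misclassification_rate(test_set, means))
```

In this hypothetical data one of the three test patterns lies closer to the forest mean than to the water mean, so it is counted as an error.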
If the classes, and perhaps even the number of
classes, of the available patterns are unknown, then one has
an unsupervised pattern recognition or clustering problem
[Ref. 12:pp. 1-2]. In clustering problems, one attempts to
find classes of patterns with similar properties [Ref. 12:p.
215]. Similarity is often defined as proximity of the
points in multispectral space according to a distance
measure [Ref. 12:p. 216].

There are many reasons why pattern recognition
provides an ideal approach to the problem of dividing an
image into its spectral or information classes. [Ref. 13:p.
136] Since pattern recognition is computer-oriented, it
allows for rapid and repeatable analysis and a statistical
treatment of multivariate data. It is easily tailored to a
wide range of problems, and it produces quantitative
results. Pattern recognition is most applicable when the goal
is to categorize or classify each elementary observation
into one of a limited number of discrete classes.

2. Supervised Classification 

In a supervised classification, the identity and
location of some of the land cover types, such as urban,
agriculture, wetland, and forest, are already known through
a combination of field work, analysis of aerial photography,
maps, and personal experience. [Ref. 5:pp. 177-178] With
this knowledge, one attempts to locate specific sites in the
image that represent homogeneous examples of the known land
cover types. These areas are commonly referred to as
training sites because the spectral characteristics of these
known areas are used to "train" the classification
algorithm. The classifier is then used to assign every pixel in
the image to the class of which it has the greatest
likelihood of being a member.

"To yield acceptable classification results,
training data must be both representative and complete." [Ref.
8:p. 678] This means that all spectral classes
constituting each information class must be adequately
represented in the training data for a supervised classification
to produce acceptable results [Ref. 8:p. 679].

Since the information to develop complete training
data was not available, this study used unsupervised
classification methods.

3. Unsupervised Classification 

In an unsupervised classification, the identities of
the classes of land cover types within a scene are not known
beforehand because adequate ground information is lacking or
surface features within the scene are not well defined [Ref.
5:p. 178]. Clustering algorithms are used to search for
"natural" groupings of the pixels in multispectral feature
space [Ref. 5:p. 215]. Once the data are classified, one
attempts to assign these "natural" or spectral classes to
the information classes of interest. It is usually
necessary to combine some of the clusters, since one information
class may be composed of more than one spectral class [Ref.
5:p. 219]. Also, some of the clusters may be less
meaningful because they represent mixed classes of earth surface
materials [Ref. 5:p. 215].
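
The step of combining spectral clusters into information classes amounts to a lookup from cluster number to class name. The following is a minimal sketch; the function name, the cluster numbers, and the class names are hypothetical and do not represent the assignments made in this study.

```python
def relabel(cluster_map, cluster_to_info, unassigned="unclassified"):
    """Replace each cluster number in a 2-D map with its information class."""
    return [[cluster_to_info.get(cluster, unassigned) for cluster in row]
            for row in cluster_map]


if __name__ == "__main__":
    # Hypothetical lookup: two spectral clusters are merged into "forest".
    cluster_to_info = {0: "water", 1: "forest", 2: "forest"}
    cluster_map = [[0, 1],
                   [2, 9]]  # cluster 9 has no assigned information class
    for row in relabel(cluster_map, cluster_to_info):
        print(row)
```

Clusters that represent mixed materials can simply be left out of the lookup so that they remain unclassified in the output.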

Clusters are generally defined as groups of points
that are "similar" according to some measurement criteria.
[Ref. 12:p. 216] Usually, "similarity" is defined as
proximity of the points in multispectral space according to a
distance measure.
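
Clustering by proximity in multispectral space can be illustrated with a K-means style procedure. The following is a minimal sketch, not the LAS routine used in this study. It assumes Euclidean distance as the distance measure and takes the initial cluster centers as an input; the function names and the sample two-band points are hypothetical. It requires Python 3.8 or later for `math.dist`.

```python
import math


def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def k_means(points, initial_centers, max_iterations=50):
    """Alternate between assigning points and updating centers until stable."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(max_iterations):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i, old_center in enumerate(centers):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New center is the mean of the member points in each band.
                new_centers.append(
                    tuple(sum(band) / len(members) for band in zip(*members)))
            else:
                new_centers.append(old_center)  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in points]
    return centers, labels


if __name__ == "__main__":
    # Hypothetical two-band pixel values forming two obvious groups.
    points = [(10, 12), (12, 10), (11, 11), (60, 90), (64, 86), (62, 88)]
    centers, labels = k_means(points, initial_centers=[(10, 12), (60, 90)])
    print(centers)
    print(labels)
```

Other distance measures can be substituted in `nearest` without changing the rest of the procedure.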

There are several reasons for interest in using 
unsupervised pattern recognition. [Ref. 14:p. 67] The 
collection and labeling of a large set of sample patterns 
can be very costly and time consuming. In many applica¬ 
tions, the characteristics of the patterns can change slowly 
with time. In the early stages of an investigation it may 
be valuable to gain some insight into the nature or struc¬ 
ture of the data. 

One of the primary advantages of unsupervised classification
is that the classifier identifies the distinct
spectral classes present in the image data. [Ref. 8:p. 685- 
686] Many of these classes might not initially be apparent 
to an analyst applying a supervised classifier. Also, the 
spectral classes in a scene might be so numerous that it 
would be difficult to train on all of them. In an unsupervised
approach they are found automatically.

There are several other advantages of unsupervised 
pattern recognition. [Ref. 7:p. 299] The classes defined 
by unsupervised classification are often much more uniform 
with respect to spectral composition than are those generat¬ 
ed by supervised classification. Unique classes are recog¬ 
nized as distinct units. No extensive prior knowledge of 
the region is required, and the opportunity for human error 
is minimized. 

A serious disadvantage of unsupervised classifica¬ 
tion is that clear matches between spectral and information 
classes are not always possible. [Ref. 7:p. 309] Some 
information classes may not have direct spectral counter¬ 
parts, and vice versa. Also, comparing the classification 
results from different regions or dates may require the same 
set of information categories. This is easily handled in 
supervised classification by appropriate selection of train¬ 
ing sites. This may be difficult to do with unsupervised 
classification, however, since there is no provision in 
unsupervised pattern recognition to use information from 
outside of the image being classified or to define training 
sites. 
E. FEATURE SELECTION

The LANDSAT thematic mapper (TM) acquires images in 
seven spectral bands. Because of the amount of data and the 
related processing time, subsets or transformations of the 
seven bands are often used to reduce the dimensionality of 
the data and thus reduce the computation time of the classi¬ 
fication problem. 

Generally, the more bands analyzed in a classification
problem, the greater the cost and perhaps the greater the
amount of redundant spectral information being used. [Ref.
5:p. 198] Therefore, a basic problem in multispectral
pattern recognition is to find a technique that will allow
separation of the major land cover classes with a minimum of
error and a minimum number of bands.

A judgement must be made to determine those bands that 
are most effective in discriminating each class from all 
others. This process is commonly called feature selec¬ 
tion. The goal is to delete from the analysis those 
bands that provide only redundant spectral information. 

In this way the dimensionality (i.e., the number of 
bands to be processed) in the data set may be reduced. 

This process minimizes the cost of the digital image 
classification (but hopefully, not the accuracy). [Ref. 
5:p. 189] 

A feature or feature vector can be any mathematical 
transformation of the band measurements. [Ref. 13:p. 175] 
Transformations that are often used in remote sensing appli¬ 
cations include band subsets, between band ratios, and 
linear transformations. Though many linear transformations 
are possible, attention was restricted to two of the more 
common ones, the principal component transformation and the
so-called tasseled cap transformation. All of these methods 
of reducing data dimensionality were used in this study, 
though the band ratio method was only used with the CORINTO 
site. 

1. Band Subsets

The simplest method of reducing the dimensionality 
of the original multispectral data is to select only a 
subset of the available bands for use in pattern recogni¬ 
tion. "Generally, the best three-band combinations include 
one of the visible bands (TM 1, 2, or 3) and one of the 
longer-wavelength infrared bands (TM 5 or 7) together with 
TM band 4." [Ref. 5:p. 91] 

Studies frequently use one of the visible bands, one 
of the mid-IR bands, and the near-IR band, band 4, to reduce 
the dimensionality while retaining a "maximum" amount of 
information. Band 6 (the thermal IR band) is often not used 
because of its different spatial resolution. 

Thompson and Henderson [Ref. 15] used the band set 
(4 5 7) to investigate soil properties under grassland 
vegetation. Crippen [Ref. 16] claims that the band set (1 4 
7) is commonly the combination of choice based on qualita¬ 
tive evaluations for both barren and vegetated areas. 

Karaska et al. [Ref. 17] found the band set [(1, 2, or 3) 4 
5] to be useful in distinguishing forest cover types. 
2. Band Ratios


Sometimes differences in brightness from similar 
surface materials are caused by topographic conditions, 
shadows, or seasonal changes in sunlight illumination. 

[Ref. 5:p. 135] These conditions may hamper the ability of 
a classification algorithm to identify the surface materials 
in a remotely sensed image. Ratio transformations of the 
remotely sensed data can, in certain instances, be used to 
reduce the effects of such conditions. These ratios may 
also provide unique information not available in any single 
band that is useful in discriminating between soil and 
vegetation. 

Ratios can also be useful in reducing a condition 
called the "topographic effect," which is manifested in
Landsat images by the visual appearance of terrain ruggedness.
[Ref. 18:p. 115] The topographic effect is caused by
differential spectral radiance due to surface slope and 
aspect variations. 

When using a simple ratio, division by zero is 
possible, and ratios less than one are common. [Ref. 8:p. 
655] Rounding to the nearest integer will compress much of 
the ratio into gray levels 0 or 1. One means of solving 
this problem is to define a new gray scale value using the 
equation 

    G' = K \arctan\left( \frac{G_{band1}}{G_{band2}} \right)          (3)

where G' is the new gray scale value and K is a scaling
factor calculated to place the ratio values in the proper
integer range. For positive values of G_{band1} and G_{band2}, the
ratio G_{band1}/G_{band2} will range from 0 to infinity and the arc
tangent will range from 0 to \pi/2. Therefore, for an eight-bit display, a
value of 162.3 is appropriate for K. The value of G' will 
then range from 0 to 255. 
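
The scaling can be sketched in a few lines of Python. This is a
minimal illustration of the arc tangent form described above, not
the routine used in this study; the function name and the choice to
return 0 when both bands are 0 are assumptions. The two-argument
arc tangent is used so that a zero denominator does not cause a
division error.

```python
import math

def scaled_ratio(g1, g2, k=162.3):
    """Return the integer gray level K * arctan(g1 / g2).

    g1, g2: non-negative gray scale values from the two bands.
    k:      scaling factor; 162.3 maps the range 0..pi/2 onto 0..255.
    """
    # For non-negative inputs atan2(g1, g2) equals arctan(g1 / g2),
    # and it returns pi/2 when g2 is 0 and g1 is positive.
    angle = math.atan2(g1, g2)
    return round(k * angle)

equal_bands = scaled_ratio(50, 50)      # ratio of 1
zero_denominator = scaled_ratio(100, 0) # ratio tends to infinity
zero_numerator = scaled_ratio(0, 50)    # ratio of 0
```

A ratio of 1 falls near the middle of the gray scale, so ratios
below and above 1 each receive about half of the available levels.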

The ratio image has several useful properties. 

Since the relationship holds for both shadowed and directly 
illuminated pixels, the ratio image shows pure reflectance 
information without the effects of topography. [Ref. 7:p. 
454] This result allows one to examine the reflectance 
properties of surfaces without the confusing effects of 
mixed brightness of topography and material reflectance. 
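
The cancellation of illumination in a ratio can be shown with a
small numerical sketch. The reflectances, the sensor gain, and the
illumination factors below are hypothetical, and the sketch assumes
a purely multiplicative illumination effect with no additive
atmospheric contribution.

```python
import math

def band_ratio(g1, g2):
    """Simple ratio of the gray scale values in two bands."""
    return g1 / g2

# Hypothetical reflectances of one surface material in two bands.
reflectance = (0.30, 0.12)
gain = 200.0  # assumed conversion from reflectance to gray level

# The same material under full illumination and in partial shadow.
sunlit = [r * gain * 1.0 for r in reflectance]
shadowed = [r * gain * 0.4 for r in reflectance]

ratio_sunlit = band_ratio(*sunlit)
ratio_shadowed = band_ratio(*shadowed)
```

The individual band values differ by the illumination factor, but
the two ratios agree, since the factor appears in both numerator
and denominator.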

Ratioed images are often useful for discriminating 
subtle spectral variations in a scene that are masked by the 
brightness variations in the individual spectral bands. 

[Ref. 8:p. 650] Ratioed images portray variations in the 
slopes of the spectral reflectance curves between the two 
bands involved, regardless of the absolute reflectance 
values observed in the bands. However, since ratio images 
are intensity blind, dissimilar materials with different 
absolute radiances but having similar slopes of their spectral
reflectance curves may appear identical in the ratio
image [Ref. 8:p. 654]. 

A problem with band ratios is that severe atmospher¬ 
ic effects, if present, can differ from one band to the 
other. [Ref. 7:pp. 457-458] The value of the band ratio will
no longer portray only the spectral properties of the ground 
surface. It will have values greatly altered by the varied 
atmospheric contributions to the separate bands. 

3. Principal Component Transformation 

Extensive interband correlation is often encountered 
in the analysis of multispectral image data. [Ref. 8:p. 
655-656] The images generated from the various spectral 
bands appear similar and convey much of the same informa¬ 
tion. The purpose of the principal component transformation 
is to compress the information contained in the original set 
of n bands into a fewer number of bands or components. The 
components are then used instead of the original data. 

The principal component transform (also known as the 
Hotelling, eigenvector, or discrete Karhunen-Loeve transform 
[Ref. 19:p. 122]) transforms a correlated set of multispec¬ 
tral image data into an uncorrelated data set with certain 
ordered variance properties [Ref. 5:p. 151]. The choice of 
the basis vectors for the transform is made so that these 
vectors point in the direction of the maximum variance of 
the data, subject to the constraint that all of the vectors 
be mutually orthogonal and the transformed components be
uncorrelated [Ref. 19:p. 125-126]. 

The principal component vectors are computed at each 
pixel from the original set of n image bands by the trans¬ 
formation 

    \mathbf{y} = T(\mathbf{x} - \mathbf{m})          (4)

where x is the (n x 1) vector of gray scale values for each
pixel, m is the mean vector of the image, T is an (n x n)
orthogonal transformation matrix, the rows of which are the 
normalized eigenvectors of the image covariance matrix 
arranged with the eigenvalues in descending order, and y is 
the vector of principal components, which is calculated 
independently for each pixel. [Ref. 20:pp. KARLOV-2 to 
KARLOV-3] 
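
The transformation can be illustrated for the simplest case of two
bands, where the eigenvalues of the covariance matrix have a closed
form. The Python sketch below is an assumption-laden illustration
and is not the KARLOV routine: the function name and the sample
pixel values are hypothetical, and a real image would use all n
bands with a general eigenvalue solver.

```python
import math

def pca_two_bands(pixels):
    """Principal component transform y = T (x - m) for two-band pixels.

    pixels: list of (band_a, band_b) gray scale values.
    Returns (eigenvalues, T, transformed), eigenvalues in descending order.
    """
    n = len(pixels)
    # Mean vector m of the image.
    m = [sum(p[i] for p in pixels) / n for i in range(2)]
    centered = [(p[0] - m[0], p[1] - m[1]) for p in pixels]
    # Elements of the symmetric covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    eigenvalues = [half_trace + root, half_trace - root]
    if b == 0:
        # Bands already uncorrelated: the band axes are the components.
        T = [(1.0, 0.0), (0.0, 1.0)] if a >= c else [(0.0, 1.0), (1.0, 0.0)]
    else:
        # Rows of T are the normalized eigenvectors (b, lambda - a).
        T = []
        for lam in eigenvalues:
            norm = math.hypot(b, lam - a)
            T.append((b / norm, (lam - a) / norm))
    transformed = [tuple(r[0] * x + r[1] * y for r in T) for x, y in centered]
    return eigenvalues, T, transformed

# Hypothetical, strongly correlated two-band sample.
sample = [(10, 12), (20, 19), (30, 33), (40, 38), (50, 52)]
eigenvalues, T, components = pca_two_bands(sample)
```

For this sample nearly all of the variance falls in the first
component, and the two components are uncorrelated, which is the
property the transformation is designed to provide.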

Since processing cost is dependent on the dimension¬ 
ality of input to the pattern recognition algorithm, the 
usual procedure is to select a subset of the principal 
component vector for further processing [Ref. 19:p. 325]. 
Since the components are ranked so that each component has a
variance less than the previous component, a reduction in
the effective number of bands can occur, since the higher
numbered components will contain less information [Ref.
21:p. 220].

The eigenvalues of the transformation also contain
useful information. [Ref. 5:p. 154] It is possible to
determine the percent of total variance explained by each of
the principal components, %_p, using the equation

    \%_p = \frac{\lambda_p \times 100}{\sum_{i=1}^{n} \lambda_i}          (5)

where \lambda_p is the pth eigenvalue out of the possible n
eigenvalues.

4. Tasseled Cap Transformation 

A principal component transformation can fail to
capture the complex structure of Landsat TM data and is
extremely scene dependent [Ref. 22:p. 262].

The TM tasseled cap transformation, on the other hand, 
specifically emphasizes the inherent data structures, 
and is intended to be an invariant transformation which 
can therefore be applied to any TM scene (although 
atmosphere and illumination geometry will affect re¬ 
sults, as may substantial deviation from a mid-latitude, 
temperate environment). [Ref. 22:p. 262]

The analysis of remotely sensed data can be thought 
of as a three-step process. [Ref. 22:p. 256] The first 
step is to understand the relationship among the sensor 
bands for the scene classes of interest. The second step is 
to compress the number of spectral bands into a manageable 
number of features, and the third step is to extract physi¬ 
cal scene characteristics from the spectral features. The 
principal component transformation provides data volume 
reduction, but it presents significant obstacles with regard 
to physical interpretation of the derived features and 
comparisons between dates or scenes. The tasseled cap
transformation accomplishes all of these functions. 

The TM tasseled cap transformation is a linear
transformation that rotates the six TM reflective bands into 
TM tasseled cap coordinates. [Ref. 22:pp. 256-257] The 
components of the transformation matrix are given in Table 
3. 

TABLE 3. THEMATIC MAPPER TASSELED CAP COEFFICIENTS. FROM
REF. 22:P. 257

Feature       Band 1   Band 2   Band 3   Band 4   Band 5   Band 7
Brightness    0.3037   0.2793   0.4743   0.5585   0.5082   0.1863
Greenness    -0.2848  -0.2435  -0.5436   0.7243   0.0840  -0.1800
Wetness       0.1509   0.1973   0.3279   0.3406  -0.7112  -0.4572
Fourth       -0.8242   0.0849   0.4392  -0.0580   0.2012  -0.2768
Fifth        -0.3280   0.0549   0.1075   0.1855  -0.4357   0.8085
Sixth         0.1084  -0.9022   0.4120   0.0573  -0.0251   0.0238


The data in the six TM reflective bands were found 
to primarily occupy a three-dimensional space defined by two 
perpendicular planes and a transition region between them. 
[Ref. 10:p. 84-85] One plane, the plane of vegetation, 
contains fully-vegetated samples, while the other plane, the 
plane of soils, contains bare soil samples. Samples that 
contain both soil and vegetation fall in the transition 
region between the two planes. These features typically 
capture 95 percent or more of the total variation in TM 
images. 
The three basic features of the TM tasseled cap are
called greenness, brightness, and wetness. "Brightness" is 
a weighted sum of all six reflective TM bands. [Ref. 22:p. 
257-259] It is responsive to changes in total reflectance 
and to those physical properties that affect total reflect¬ 
ance. "Greenness" is a contrast between the sum of the 
visible bands and the near-infrared band. (The two mid- 
infrared bands essentially cancel each other.) "TM green¬ 
ness responds to the combination of high absorption in the 
visible bands (due to plant pigments and particularly chlo¬ 
rophyll) and high reflectance in the near-infrared (due to 
internal leaf structure and the resultant scattering of 
near-infrared radiation) which is characteristic of green 
vegetation." [Ref. 22:p. 259] "Wetness" contrasts the sum 
of the visible and near-infrared bands with the sum of the 
mid-infrared bands. The name wetness was chosen because the 
mid-infrared bands have been suggested to be most sensitive 
to both soil and plant moisture. 

Brightness defines the intersection between two 
perpendicular planes, the plane of vegetation and the plane 
of soils. [Ref. 22:p. 258-261] The plane of vegetation is 
defined by brightness and greenness, the plane of soils is 
defined by brightness and wetness, and the transition zone 
between the two planes is defined by greenness and wetness. 
The final three features contain the residual variation of 
the scene. 
The tasseled cap transformation presents TM data in
a more accessible fashion by changing the viewing perspec¬ 
tive. [Ref. 22:p. 262] It reduces the data volume by 
concentrating the majority of data variability in three 
features. By making a direct link between the features and 
the physical scene characteristics it enhances both the 
interpretation of observed spectral variation and the pre¬ 
diction of the spectral effects of particular changes in 
scene characteristics. 

An agricultural field can be used to provide an 
example of the uses of the tasseled cap transformation. 

[Ref. 5:p. 165] During a growing season, the field is 
expected to begin near the plane of soils, move through the 
transition zone as the crop grows, arrive at the plane of 
vegetation near the end of crop development, and then move 
back toward the plane of soils during harvest or senescence. 

All of the methods of reducing data dimensionality 
mentioned above (band subsets, band ratios, the principal 
component transformation, and the tasseled cap transforma¬ 
tion) were used in this study. The band-ratio method was 
only used with the CORINTO site. 
METHODOLOGY


A. SELECTION OF STUDY SITES 
1. Site selection 

The scene used in this study was acquired by the 
Landsat 5 thematic mapper on 24 March 1986 and covered the 
area shown in the large box in Figure 2 (path 17, row 51, 
scene identification 507531500). It was obtained on CCT-P 
computer-compatible tapes, which have been resampled to 28.5 
x 28.5 m pixels in the reflective bands and 120 x 120 m
pixels in the thermal-infrared band. [Ref. 4:p. 4-5] The
images were radiometrically and geometrically corrected, as
far as available information allowed. 

Since a Landsat scene covers an area of 185 x 185 
km, a significant reduction in area was necessary to achieve 
a study site of a manageable size. First, a one-quarter 
scale photomosaic of the entire Landsat scene was made by 
photographing screen images of portions of the scene. 
Cloud-free areas of the scene were identified as potential 
sites for further study. Maps of 1:50,000 scale for por¬ 
tions of the scene were then obtained from the Defense 
Mapping Agency. These maps were examined for a variety of 
terrain types and for ease in registering the maps to the 
Landsat scene. 
Figure 2. Area Covered by Landsat Scene. After Ref. 23:p. 61


Because of the variety of terrain types available in 
a relatively small area, attention was focused on an area 
covered by six map sheets. This area is approximately 72 x 
36 km in size. It lies completely in Nicaragua and runs 
roughly from Corinto on the Pacific coast to the western 
shore of Lake Managua, including the cities of Leon and 
Chinandega and part of the Cordillera de los Marrabios chain 
of volcanoes, including the volcanoes Telica and Momotombo.
The smaller box in Figure 2 shows the six-map area. 

From the six-map area, two 512 x 512 pixel (14.6 x 
14.6 km) sites were selected for detailed study. These two 
areas together contain a variety of terrain types. The 
first area, called CORINTO, contains the port city of Corinto,
a river/estuary system with extensive mangrove swamps,
some streams, and agricultural land. The second area,
called MALP, lies east of the town of Malpaisillo and contains
some smaller streams and a variety of vegetative cover
types such as woodland, scrub, and agricultural land. The 
boxed areas in Figure 3 show the approximate site bound¬ 
aries. 

2. Geography of the Area 

Nicaragua can be divided into three major regions: 
the drier, fertile Pacific region and Great Rift Valley; the 
wetter, cooler Central Highlands; and the hot and humid
Atlantic Coast region. [Ref. 24:p. 66] The six-map area
lies completely in the first region. 

Western Nicaragua is marked by a line of young 
volcanoes running between the Gulf of Fonseca and Lake 
Nicaragua. [Ref. 24:p. 66] Many of these volcanoes are 
still active. These volcanic peaks protrude from a large 
crustal fracture or rift that forms a long, narrow
depression running southeast from the Gulf of Fonseca to the
Rio San Juan drainage.

Figure 3. Approximate Study Site Boundaries. After Ref. 23:p. 61

Surrounding the lakes and extending northwest of 
them to the Gulf of Fonseca are fertile lowland plains 
highly enriched with volcanic ash. [Ref. 24:p. 66] These 
lowlands are densely populated and well cultivated. The
rivers in this area are short and carry a small volume of
water. The soil is volcanic and 85% of the area is fertile. 

Mean annual precipitation for these plains and the 
flanking uplands ranges from 100 to 150 centimeters. [Ref. 
24:p. 66] Rainfall is usually seasonal. May through
October is the rainy season, and December through April is 
the driest period. 

3. The CORINTO Site

The CORINTO site includes the city of Corinto and 
the area to the northeast. Figure 4 is a map of the site, 
with the dark rectangle marking the approximate site bound¬ 
ary. The dark areas around the rivers in the lower left 
part of the map represent mangrove swamp; the horizontal 
dashed lines at the edge of parts of the mangrove represent 
areas that are subject to inundation; and the pattern of 
lines in the right half of the map represent small irriga¬ 
tion works. The lower and upper portions of the map come 
from two different map sheets. The seven original TM band 
images of the site can be found in Appendix A. 

The statistics of the CORINTO image are presented in 
Table 4. The three visible bands (1, 2, and 3) exhibit a 
considerable degree of redundancy; their correlations range 
from 0.89 to 0.94. There is also considerable redundancy 
between the mid-infrared bands (5 and 7), with a correlation 
of 0.91. There are redundancies between the visible bands
and the mid-infrared bands. The lowest correlations occur
between band 4, the near-infrared band, and the other bands,
so band 4 is the least redundant. Band 6, the thermal-infrared
band, is most correlated with band 7. It is practically
uncorrelated with band 4, and it is moderately
correlated with the other bands. Band 5 has the greatest
variance, followed by bands 4 and 7.

Figure 4. Map of the CORINTO site. From Ref. 25, 26












TABLE 4. STATISTICS FOR THE CORINTO THEMATIC MAPPER IMAGE

Band number        1        2        3        4        5        6        7

Univariate statistics
Mean           93.75    38.35    44.17    65.79    86.28   160.41    40.18
Std. dev.       8.75     6.56    12.53    22.70    33.43    12.03    20.11
Variance       76.65    43.08   157.06   515.24  1117.65   144.75   404.41
Minimum           71       21       15        4        0      134        0
Maximum          255      138      170      159      255      223      255

Variance-covariance matrix
1              76.65
2              52.46    43.08
3             102.86    73.34   157.06
4              64.95    89.56    88.54   515.24
5             232.33   177.52   372.15   357.62  1117.65
6              71.18    45.30   114.99    17.33   308.07   144.75
7             136.59    89.46   220.56    64.47   611.42   216.26   404.41

Correlation matrix
1               1.00
2               0.91     1.00
3               0.94     0.89     1.00
4               0.33     0.60     0.30     1.00
5               0.79     0.81     0.89     0.47     1.00
6               0.67     0.57     0.76     0.06     0.76     1.00
7               0.77     0.68     0.87     0.14     0.91     0.89     1.00


4. The MALP Site 

The MALP site is a fairly flat area of mixed terrain 
east of the town of Malpaisillo, hence the site name. The 
site name was abbreviated due to restrictions on file name 
length on the computer used in the study. Figure 5 is a map 
of the site, with the dark rectangle marking the approximate 
site boundary. Since the scrub areas show only as slightly 
shaded areas and the woodland areas do not show up at all (a 
problem of copying multi-colored maps in black and white), 
Figure 6 was created, which is an overlay of the map in 
Figure 5 showing the missing woodland terrain feature. The 
white area to the upper right of the map is outside of the
six-map area and was not available during the study. The
seven original TM band images of the site can be found in
Appendix A.

Figure 5. Map of the MALP Site. From Ref. 27

The statistics of the MALP image are presented in
Table 5. The three visible bands are all highly correlated
with one another (correlation ≥ 0.95), indicating that there
is substantial redundant information in these bands. There
is also considerable redundancy between the visible and
mid-infrared bands (5 and 7), as the correlations range from
0.81 to 0.87. Band 4 is not highly correlated with any of
the other bands. Band 6 is also not highly correlated with
any of the other bands. The range of pixel values in each
band is less than that of the CORINTO image because there is
less variety of terrain types in this image and no open
water.

Figure 6. Overlay of the MALP Map Showing Woodland and
Scrub

TABLE 5. STATISTICS FOR THE MALP THEMATIC MAPPER IMAGE

Band number        1        2        3        4        5        6        7

Univariate statistics
Mean          108.06    47.60    68.26    73.11   137.75   170.26    67.92
Std. dev.      12.05     8.31    16.45    13.44    24.93     5.95    14.02
Variance      145.14    69.12   270.65   180.79   621.77    35.40   196.49
Minimum           77       27       24       25       28      139       13
Maximum          187       91      147      144      224      198      244

Variance-covariance matrix
1             145.14
2              97.08    69.12
3             189.17   134.06   270.65
4              99.20    78.44   145.17   180.79
5             259.04   178.40   358.80   218.43   621.77
6              27.01    17.52    41.90    -7.34    54.61    35.40
7             142.34    94.75   194.14    72.75   302.90    53.11   196.49

Correlation matrix
1               1.00
2               0.97     1.00
3               0.95     0.98     1.00
4               0.61     0.70     0.66     1.00
5               0.86     0.86     0.87     0.65     1.00
6               0.38     0.35     0.43    -0.09     0.37     1.00
7               0.84     0.81     0.84     0.38     0.87     0.64     1.00

B. BAND AND FEATURE SELECTION

The LANDSAT thematic mapper (TM) records images in seven 
spectral bands. Because of the amount of data and the 
related processing time, subsets or transformations of the 
seven bands are often used to reduce the dimensionality of
the data, and thus reduce the computation time of the clas¬ 
sification problem. 

1. Band Subsets 

There are 35 possible three-band combinations of the seven TM
bands and 20 possible three-band combinations of six bands (if
the thermal IR band is not used). Clearly, one does not want to
analyze every possible band combination, especially when 
some of the bands are highly correlated. 
a. The Optimum Index Factor

Use of the optimum index factor (OIF) is one way 
to deal with the problem of evaluating the possible band 
combinations. [Ref. 5:pp. 90-91] This technique is based 
on the amount of total variance and correlation within and 
between various band combinations. The OIF for a three-band 
subset is 

    OIF = \frac{\sum_{k=1}^{3} s_k}{\sum_{j=1}^{3} \mathrm{Abs}(r_j)}          (1)

where s_k is the standard deviation of band k and r_j is the
correlation coefficient between any two of the three bands
being evaluated. The three-band combination with the larg¬ 
est OIF generally will have the most information (as mea¬ 
sured by variance) with the least amount of duplication (as 
measured by correlation). Combinations ranking close together
may produce similar results.

The OIF rankings were calculated under two condi¬ 
tions: both including and excluding band 6, the thermal IR 
band. The OIF rankings for the CORINTO image are in Table 6 
and the rankings for the MALP image are in Table 7. 
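
The ranking procedure can be sketched in Python using the standard
deviations and correlation coefficients of the CORINTO image from
Table 4. This is an illustrative reconstruction, not the program
used in the study; the function and variable names are assumptions.
Because the correlations in Table 4 are rounded to two decimal
places, the computed values only approximate the tabulated OIF
values, and closely ranked combinations may change places.

```python
from itertools import combinations

# Standard deviations of the CORINTO image bands (Table 4).
STD_DEV = {1: 8.75, 2: 6.56, 3: 12.53, 4: 22.70,
           5: 33.43, 6: 12.03, 7: 20.11}

# Correlation coefficients between band pairs (Table 4).
CORR = {
    (1, 2): 0.91, (1, 3): 0.94, (2, 3): 0.89,
    (1, 4): 0.33, (2, 4): 0.60, (3, 4): 0.30,
    (1, 5): 0.79, (2, 5): 0.81, (3, 5): 0.89, (4, 5): 0.47,
    (1, 6): 0.67, (2, 6): 0.57, (3, 6): 0.76, (4, 6): 0.06,
    (5, 6): 0.76,
    (1, 7): 0.77, (2, 7): 0.68, (3, 7): 0.87, (4, 7): 0.14,
    (5, 7): 0.91, (6, 7): 0.89,
}

def oif(bands):
    """Optimum index factor of a three-band subset."""
    total_std = sum(STD_DEV[b] for b in bands)
    # Sum of absolute correlations over the three band pairs.
    total_corr = sum(abs(CORR[pair])
                     for pair in combinations(sorted(bands), 2))
    return total_std / total_corr

def rank_band_subsets(bands):
    """Return (OIF, subset) pairs for all three-band subsets, best first."""
    subsets = combinations(sorted(bands), 3)
    return sorted(((oif(s), s) for s in subsets), reverse=True)

ranking_all = rank_band_subsets(range(1, 8))
ranking_no_thermal = rank_band_subsets([1, 2, 3, 4, 5, 7])
```

The two rankings contain 35 and 20 subsets respectively, matching
the number of possible three-band combinations of seven and six
bands.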


TABLE 6. OPTIMUM INDEX FACTOR RANKINGS FOR THE CORINTO
IMAGE

        Combination                    Combination
Rank    (Using all bands)     OIF      (Without Band 6)      OIF
  1     4,5,6               52.501     4,5,7               50.094
  2     4,5,7               50.094     3,4,7               42.019
  3     4,6,7               50.030     1,4,7               41.454
  4     3,4,6               42.054     3,4,5               41.358
  5     3,4,7               42.019     1,4,5               40.760
  6     1,4,7               41.454     2,4,7               34.765
  7     3,4,5               41.358     2,4,5               33.321
  8     1,4,6               40.904     1,3,4               28.105
  9     1,4,5               40.760     1,5,7               25.129
 10     2,4,7               34.765     2,5,7               25.083
 11     2,4,6               33.448     3,5,7               24.721
 12     2,4,5               33.321     2,3,4               23.304
 13     1,3,4               28.105     1,3,5               20.889
 14     5,6,7               25.570     1,2,4               20.650
 15     1,5,7               25.129     2,3,5               20.290
 16     2,5,7               25.083     1,2,5               19.377
 17     3,5,7               24.721     2,3,7               16.038
 18     1,5,6               24.308     1,3,7               15.993
 19     2,5,6               24.272     1,2,7               14.970
 20     3,5,6               24.048     1,2,3               10.157
 21     2,3,4               23.304
 22     1,3,5               20.889
 23     1,2,4               20.650
 24     2,3,5               20.290
 25     1,2,5               19.377
 26     2,6,7               18.099
 27     3,6,7               17.695
 28     1,6,7               17.489
 29     2,3,7               16.039
 30     1,3,7               15.993
 31     1,2,7               14.970
 32     1,3,6               14.705
 33     2,3,6               14.027
 34     1,2,6               12.705
 35     1,2,3               10.157
TABLE 7. OPTIMUM INDEX FACTOR RANKINGS FOR THE MALP IMAGE

        Combination                    Combination
Rank    (Using all bands)     OIF      (Without Band 6)      OIF
  1     4,5,6               39.886     4,5,7               27.519
  2     3,4,6               30.478     3,4,5               25.124
  3     4,6,7               29.978     1,4,5               23.717
  4     1,4,6               29.084     3,4,7               23.307
  5     3,5,6               28.331     1,4,7               21.458
  6     4,5,7               27.519     3,5,7               21.448
  7     1,5,6               26.710     2,4,5               21.093
  8     3,4,5               25.124     1,3,5               19.853
  9     2,5,6               24.764     1,5,7               19.830
 10     2,4,6               24.143     1,3,4               18.867
 11     5,6,7               23.995     2,4,7               18.824
 12     1,4,5               23.717     2,5,7               18.608
 13     3,4,7               23.307     2,3,5               18.303
 14     1,4,7               21.458     1,2,5               16.826
 15     3,5,7               21.448     2,3,4               16.343
 16     2,4,5               21.093     1,3,7               16.109
 17     1,3,5               19.853     1,2,4               14.806
 18     1,5,7               19.830     2,3,7               14.718
 19     1,3,6               19.579     1,2,7               13.096
 20     3,6,7               19.100     1,2,3               12.677
 21     1,3,4               18.867
 22     2,4,7               18.824
 23     2,5,7               18.608
 24     2,3,5               18.303
 25     2,3,6               17.427
 26     1,6,7               17.244
 27     1,2,5               16.826
 28     2,3,4               16.343
 29     1,3,7               16.109
 30     2,6,7               15.676
 31     1,2,6               15.474
 32     1,2,4               14.806
 33     2,3,7               14.718
 34     1,2,7               13.096
 35     1,2,3               12.677




When all seven bands are considered, the (4 5 6)
band combination ranked first for both images. Band 6 also 
appears in many of the top-ranked band combinations. This 
would seem to indicate that band 6 contains information that 
is not duplicated by the other bands, even though its spa¬ 
tial resolution is poorer. Bands 4 and 5 also appear in 
many of the top-ranked band combinations. 

When band 6 is not included, the top-ranked band 
combination was the (4 5 7) combination. This is somewhat
unexpected, as bands 5 and 7 tend to be correlated. Almost
all of the other highly-ranked band combinations are of the 
form [(1, 2, or 3) 4 (5 or 7)], which is what one would 
expect, since the visible bands and the mid-IR bands tend to 
be correlated. The band combination (1 2 3), the three 
visible bands, consistently ranked last. 
b. Physical Arguments

From the spectral reflectance curves in Figure 1, 
it is readily seen that the greatest differentiation between 
the general land cover types of soil, water, and vegetation 
occurs in the mid-infrared, followed closely by the near- 
infrared. The thermal-infrared is best at differentiating 
soil from water and vegetation. The smallest differentia¬ 
tion occurs in the visible bands. 

In the CORINTO image, water and vegetation form 
the main areas of interest, so the best choice of a three 
band subset might be the three bands that appear individual¬ 
ly to be the best for discriminating those surface materi¬ 
als. Looking at the TM band characteristics in Table 2, it 
appears that bands 1, 4, and 5 would be the best choice, 
instead of bands 4, 5, and 7. The set (1 4 5) ranked fifth
by OIF, but the OIF calculations included the total scene 
statistics, not just the statistics from the distinguishable 
surface features of interest. There is not enough ground 
information to classify much of the land cover in this 
image, so the OIF may not be the best means of selecting a
three band set. 

In the MALP image, there are no large areas of 
open water, so the main problem is to distinguish the vege¬ 
tation of interest from soil and other vegetation. From 
Table 2, it appears that band sets (1 4 5) or (3 4 5) would
be the best choices. These band sets also ranked second and 
third by OIF. In this case, there is less scene variability 
(the band variances are less) and the cover types of inter¬ 
est are similar to the cover types in areas where no infor¬ 
mation is available. 

2. Image Transformations 

Since there is substantial redundant information in 
both images, the transformations discussed in Chapter II may 
be able to reduce dimensionality while retaining more infor¬ 
mation than a simple three band subset. 

a. The Principal Component Transformation 

The principal component transformation was per¬ 
formed on both images using the appropriate routine (KARLOV) 
from the Land Analysis System [Ref. 20:pp. KARLOV-1 to 
KARLOV-5]. The routine performs the transformation as 
outlined in Chapter II. The eigenvalues for the transforma¬ 
tion of the CORINTO image are shown in Table 8, and the 
eigenvalues for the MALP image are in Table 9. For both 
images, the first three principal component bands explain
more than 97 percent of the total scene variance, as
calculated by equation 5. The first three principal component
band images for each site are in Appendix B. 

TABLE 8. EIGENVALUES OF THE CORINTO IMAGE COVARIANCE
MATRIX

Component    Eigenvalue    Percent     Cumulative
number                     Variance    Percent Variance

    1         1874.62       76.24          76.24
    2          473.44       19.25          95.49
    3           54.98        2.24          97.73
    4           39.07        1.59          99.32
    5           10.77        0.44          99.76
    6            4.39        0.18          99.94
    7            1.57        0.06         100.0


TABLE 9. EIGENVALUES OF THE MALP IMAGE COVARIANCE MATRIX

Component    Eigenvalue    Percent     Cumulative
number                     Variance    Percent Variance

    1         1277.50       84.08          84.08
    2          132.74        8.74          92.82
    3           66.09        4.35          97.17
    4           23.31        1.53          98.70
    5           12.42        0.82          99.52
    6            6.23        0.41          99.93
    7            1.06        0.07         100.0
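
The percent-variance columns can be reproduced from the eigenvalues alone. The sketch below assumes that equation 5 (not reproduced in this section) is the usual ratio of each eigenvalue to the sum of all eigenvalues; the variable names are illustrative and the input numbers are the MALP eigenvalues from Table 9.

```python
import numpy as np

# Eigenvalues of the MALP image covariance matrix, copied from Table 9.
eigenvalues = np.array([1277.50, 132.74, 66.09, 23.31, 12.42, 6.23, 1.06])

# Percent of total scene variance carried by each principal component
# (assumed form of equation 5: eigenvalue / sum of eigenvalues).
percent_variance = 100.0 * eigenvalues / eigenvalues.sum()

# Running total: variance retained when keeping the first k components.
cumulative_percent = np.cumsum(percent_variance)

for k, (p, c) in enumerate(zip(percent_variance, cumulative_percent), start=1):
    print(f"PC{k}: {p:6.2f} %   cumulative {c:6.2f} %")
```

Under that assumption the first three components account for the 97.17 percent listed in the table.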


b. The Tasseled Cap Transformation 

The other linear transformation discussed in 
Chapter II was the tasseled cap transformation. The coeffi¬ 
cients in Table 3 were used to transform both the CORINTO 
and MALP images into their greenness, brightness, and wet¬ 
ness components. A short program was written to perform the 
transformation and to scale the output images to the proper 
range (0 to 255) for the available display. The tasseled
cap component images are in Appendix B. The short program
used for the transformation is listed in Appendix C. 
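
The program in Appendix C is not reproduced here, so the following is only a minimal sketch of the two steps described: a linear combination of the bands followed by scaling to the 0 to 255 display range. The function names, the min-max stretch, and the small coefficient matrix are assumptions made for illustration; the real weights are those in Table 3.

```python
import numpy as np

def linear_transform(image, coeffs):
    """Apply a linear band transformation to an image.

    image  : array of shape (rows, cols, n_bands)
    coeffs : array of shape (n_components, n_bands); each row holds the
             weights of one output component (e.g. one row of Table 3)
    """
    weights = np.asarray(coeffs, dtype=float).T
    return np.tensordot(image.astype(float), weights, axes=1)

def scale_to_byte(component):
    """Linearly stretch one component image to the 0-255 display range.

    A min-max stretch is assumed; the original program may scale differently.
    """
    lo, hi = component.min(), component.max()
    if hi == lo:                      # flat image: avoid division by zero
        return np.zeros(component.shape, dtype=np.uint8)
    return np.round(255.0 * (component - lo) / (hi - lo)).astype(np.uint8)

# Hypothetical 2-band image and made-up weights, for illustration only.
image = np.array([[[10, 20], [30, 40]],
                  [[50, 60], [70, 80]]])
coeffs = np.array([[0.5, 0.5],     # placeholder "brightness-like" row
                   [-1.0, 1.0]])   # placeholder "greenness-like" row
components = linear_transform(image, coeffs)
display = np.stack([scale_to_byte(components[..., k])
                    for k in range(components.shape[-1])], axis=-1)
```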

c. Band Ratios 

The other image transformation discussed in 
Chapter II was band ratios. Band ratios have been found to 
be useful both to reduce the topographic effect and to 
enhance certain information in an image. For example, the 
ratio of bands 3 and 4 provides vegetation information, and 
the ratio of bands 2 and 5 is useful for identifying water 
bodies and provides subtle wetland information [Ref. 5:pp.
137-138]. 

To test the usefulness of band ratios, a few 
ratios of the CORINTO image were made and grouped together 
as multiband images. Since water, wetland, and vegetation 
are the categories of interest in this image, the three 
ratios (band 1)/(band 5), (band 2)/(band 5), and (band
3)/(band 4) were calculated. All three ratios were grouped 
together as one three-band image, and the three possible 
two-band combinations were grouped together as two-band 
images. The three ratio images are in Appendix B, and the 
program used to create these ratio images is listed in 
Appendix C. 
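
As a rough illustration of how such ratio images might be built (the Appendix C program itself is not shown here), the sketch below computes pixel-by-pixel ratios and stacks them into a multiband image. The function names and the choice to set zero-denominator pixels to 0 are assumptions.

```python
import numpy as np

def band_ratio(numerator, denominator):
    """Pixel-by-pixel ratio of two bands; zero-denominator pixels are set to 0."""
    num = numerator.astype(float)
    den = denominator.astype(float)
    ratio = np.zeros_like(num)
    np.divide(num, den, out=ratio, where=den != 0)
    return ratio

def ratio_image(tm, pairs):
    """Stack several ratio bands into one multiband image.

    tm    : array of shape (rows, cols, 7), TM bands 1..7 along the last axis
    pairs : list of (numerator_band, denominator_band), 1-based band numbers
    """
    return np.stack([band_ratio(tm[..., a - 1], tm[..., b - 1])
                     for a, b in pairs], axis=-1)

# The three ratios used for the CORINTO site.
CORINTO_RATIOS = [(1, 5), (2, 5), (3, 4)]
```

A two-band ratio image is obtained the same way by passing any two of the three pairs.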

3. Band Selection 

After considering the possible band combinations, 
the most promising ones, based on the above discussion, were
selected. Since time and processor usage were not always a
critical constraint on this study, for completeness the full
seven band set was included as one of the band choices.
This set will have all of the information available, so it
should give a good indication of what features are and are 
not detectable. Since reducing the dimensionality does 
significantly affect processing time and would make the 
results more transportable to other processing environments, 
a number of three-band subsets and transformations were also 
selected. 

The band combinations selected for further study were:


• The seven TM bands. This set has all of the sensor 
information available. 

• Bands 4, 5, and 6. This set ranked first by OIF. 

• Bands 4, 5, and 7. This set was highly ranked by
OIF and ranked first when band 6 was not included. The 
spectral reflectance curves of soil, water, and vegeta¬ 
tion have the greatest separation in this range. These 
bands respond to soil variability as manifested by 
vegetation [Ref. 15:p. 321]. 

• Bands 1, 4, and 7. Fairly high ranking by OIF (without 
band 6). A common combination of choice [Ref. 16:p. 
141]. These bands are not highly correlated, and this 
set includes one band from each of the three major 
reflective spectral regions (visible, near-infrared, and 
mid-infrared). 

• Bands 1, 4, and 5. Fairly high OIF ranking (without 
band 6). Useful for distinguishing forest cover types 
[Ref. 17]. Physical arguments also support the selec¬ 
tion of this band combination. 

• Bands 3, 4, and 5 (MALP site only). Fairly high OIF 
ranking (without band 6). Useful for distinguishing
forest cover types [Ref. 17]. Physical arguments also
support the selection of this band combination. 

• Bands 3, 4, and 6 (MALP site only). Ranked second by 
OIF. 

• The first three principal component bands. 

• The three tasseled cap transformation bands (greenness, 
brightness and wetness). 

• Ratio images (CORINTO site only). Various combinations 
of the ratios (band 1)/(band 5), (band 2)/(band 5), and
(band 3)/(band 4). 

C. THE LAND ANALYSIS SYSTEM 

The Land Analysis System (LAS) is an image analysis 
system designed for use with satellite imagery. [Ref. 20:p. 
1] It provides the capability to manipulate and analyze 
digital image data and includes a wide range of functions 
and statistical tools for image analysis. It was the prima¬ 
ry software used in this study. In addition to routines for 
extracting the study sites from a Landsat scene, image 
statistics calculation, and file management functions, LAS 
includes a variety of routines for both supervised and 
unsupervised classification. All three of the routines for 
unsupervised classification (HINDU, KMEANS, and ISOCLASS) 
were used, as well as one of the supervised classification 
routines (MINDIST). The classification routines used are 
summarized below. A more detailed description of each 
algorithm is in Appendix D.

1. The HINDU Classification Routine


HINDU classifies a multiband image based upon its 
multidimensional histogram. [Ref. 20:p. HINDU-1] Regions 
in the histogram with high density are regarded as pattern 
clusters. The user specifies the input image, the minimum 
and maximum acceptable number of clusters, and the number of 
gray levels per histogram bin. 
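
The sketch below illustrates only the histogram-building step implied by that description: gray levels are grouped into bins and the pixels in each occupied cell of the multidimensional histogram are counted. It is not the LAS HINDU routine; how HINDU turns dense cells into clusters is described in Appendix D, and the function name here is hypothetical.

```python
import numpy as np

def multiband_histogram(image, levels_per_bin):
    """Count pixels falling into each cell of a multidimensional histogram.

    image          : array of shape (rows, cols, n_bands) of integer gray levels
    levels_per_bin : number of gray levels grouped into one histogram bin
    Returns (cells, counts): the occupied cells (one row of bin indices per
    cell) and the number of pixels in each, sorted from densest to sparsest.
    """
    pixels = image.reshape(-1, image.shape[-1])
    bins = pixels // levels_per_bin                 # quantize every band
    cells, counts = np.unique(bins, axis=0, return_counts=True)
    order = np.argsort(-counts, kind="stable")      # densest cells first
    return cells[order], counts[order]
```

Larger values of `levels_per_bin` give a coarser histogram with fewer occupied cells.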

2. The KMEANS Classification Routine

KMEANS performs an unsupervised classification using 
the K-means algorithm. The basic K-means algorithm operates 
as follows [Ref. 12:p. 218]: 

• Step 1: Begin with an arbitrary set of cluster centers 
for the desired number of clusters. 

• Step 2: Compute the sample mean of each cluster. 

• Step 3: Reassign each sample to the cluster with the 
nearest mean. 

• Step 4: If the classification of all samples has not 
changed, stop. If not, go to step 2. 
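
The four steps above can be sketched as follows. This is a generic illustration, not the LAS KMEANS routine: the function name and the `max_iterations` safeguard are assumptions, and the first pass assigns samples to the arbitrary starting centers before any cluster mean is computed.

```python
import numpy as np

def kmeans(samples, initial_centers, max_iterations=100):
    """Basic K-means on an (n_samples, n_features) array.

    Returns (labels, centers, iterations_used).
    """
    samples = np.asarray(samples, dtype=float)
    centers = np.asarray(initial_centers, dtype=float).copy()   # Step 1
    labels = np.full(len(samples), -1)
    for iteration in range(1, max_iterations + 1):
        # Step 3: assign each sample to the cluster with the nearest mean.
        distances = np.linalg.norm(
            samples[:, None, :] - centers[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Step 4: stop when no sample changed cluster.
        if np.array_equal(new_labels, labels):
            return labels, centers, iteration
        labels = new_labels
        # Step 2: recompute the sample mean of each non-empty cluster.
        for k in range(len(centers)):
            members = samples[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels, centers, max_iterations
```

Each pixel's band values form one sample, so a seven-band image gives seven features per sample.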

3. The ISOCLASS Classification Routine 

ISOCLASS performs the unsupervised classification of 
an image using an isodata-type clustering algorithm [Ref.
20:p. ISOCLASS-1]. The basic isodata clustering algorithm 
operates as follows [Ref. 12:p. 219]: 

• Step 1: Cluster the data into C classes. Eliminate any 
classes with fewer than T members.

• Step 2: On every other iteration, if C < 2N then split
any clusters whose samples form sufficiently disjoint
groups. If any clusters have been split, go to step 1.

• Step 3: Merge any clusters whose means are sufficiently 
close. 

• Step 4: Go to step 1. 

In the algorithm description, C is the number of classes, T 
is the minimum number of pixels allowed in a cluster, and N 
is the approximate desired number of clusters. 

4. The MINDIST Classification Routine

MINDIST performs a supervised classification of 
multiband images based on minimum distance from class means. 
[Ref. 20:p. MINDIST-1] It can also be used to attempt to 
improve the results of the unsupervised clustering algo¬ 
rithms, either by discarding pixels that are too far from 
cluster centroids or by reclassifying clusters based on 
cluster means and a distance rule. 
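
A minimal sketch of a minimum-distance rule with an optional rejection threshold is given below. It assumes Euclidean distance and uses hypothetical names; it is not the LAS MINDIST routine, whose options are documented in the LAS manual.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means, max_distance=None):
    """Assign each pixel to the class with the nearest mean (Euclidean distance).

    pixels       : array of shape (n_pixels, n_bands)
    class_means  : array of shape (n_classes, n_bands)
    max_distance : optional threshold; pixels farther than this from every
                   class mean are labelled -1 (unclassified)
    """
    pixels = np.asarray(pixels, dtype=float)
    means = np.asarray(class_means, dtype=float)
    distances = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    if max_distance is not None:
        labels[distances.min(axis=1) > max_distance] = -1
    return labels
```

Feeding the cluster means from an unsupervised run back in as `class_means` corresponds to the reclassification use described above.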

D. FEATURES OF INTEREST 

1. Identifiable Information Classes 

Given that the only ground reference available was 
1:50,000 scale maps, a key question is: what are the mean¬ 
ingful terrain classes that can potentially be identified? 
The symbols and color coding used on a map are identified in 
the map legend. Using the map legend, the following types 
of terrain can be identified in the two study sites: water, 
stream, mangrove, land subject to inundation, woodland,
scrub, city, and road. Grassland and cultivated land are
other probable terrain types, but these two terrain types 
were not marked on the maps. 

Landsat TM images are appropriate sources for Level 
I and many Level II categories in the U.S. Geological Survey 
(USGS) Land Use and Land Cover Classification System [Ref. 
8:pp. 138-140]. As can be seen in Table 10, almost all of
the areas identifiable from the map match categories in the 
USGS classification system. So the prospects appear good 
that the terrain types identified above may indeed be spec¬ 
trally distinguishable. 

Some of the terrain features (e.g., streams and 
roads) will generally be linear features much less than a 
pixel (28.5 m) wide. Bernstein et al. [Ref. 21:p. 195], 
when examining a TM image of Dulles airport, found that 
linear features as small as about 7.6 m (about a quarter of 
a pixel) wide could easily be visually discerned because of 
a favorable contrast ratio between the linear feature and 
its background. Since streams tend to encourage vegetation 
to grow along their banks by providing a ready source of 
water, and since water and vegetation have greatly different 
reflectivities in the infrared wavelengths, streams (at 
least the larger ones, which also have a greater potential 
for being obstacles) should also be detectable as linear 
features of subpixel width. 


TABLE 10. U.S. GEOLOGICAL SURVEY LAND USE/LAND COVER
CLASSIFICATION SYSTEM FOR USE WITH REMOTE SENSOR DATA.
FROM REF 8:P. 139

Level I                       Level II

1  Urban or built-up land     11  Residential
                              12  Commercial and services
                              13  Industrial
                              14  Transportation, communications, and
                                  services
                              15  Industrial and commercial complexes
                              16  Mixed urban or built-up land
                              17  Other urban or built-up land

2  Agricultural land          21  Cropland and pasture
                              22  Orchards, groves, vineyards, nurseries,
                                  and ornamental horticultural areas
                              23  Confined feeding operations
                              24  Other agricultural land

3  Rangeland                  31  Herbaceous rangeland
                              32  Shrub and brush rangeland
                              33  Mixed rangeland

4  Forest land                41  Deciduous forest land
                              42  Evergreen forest land
                              43  Mixed forest land

5  Water                      51  Streams and canals
                              52  Lakes
                              53  Reservoirs
                              54  Bays and estuaries

6  Wetland                    61  Forested wetland
                              62  Nonforested wetland

7  Barren land                71  Dry salt flats
                              72  Beaches
                              73  Sandy areas other than beaches
                              74  Bare exposed rocks
                              75  Strip mines, quarries, and gravel pits
                              76  Transitional areas
                              77  Mixed barren land

8  Tundra                     81  Shrub and brush tundra
                              82  Herbaceous tundra
                              83  Bare ground
                              84  Mixed tundra

9  Perennial snow and ice     91  Perennial snowfields
                              92  Glaciers


2. Information Classes of Interest 

At TM pixel size, cities are not a distinct spectral 
class, but are made up of a number of sub-classes (e.g., 
residential, commercial/industrial, parks, mixed pixels), 
some of which may not be spectrally distinguishable from 
other terrain classes of interest. Since the locations of 
cities are generally known and normal weather variations
have much less impact on cities, identification of cities
was not pursued in this study.

Roads, since they do not generally fall under the 
categories of obstacles or cover and concealment, were also 
not addressed. 

A pixel is considered to be a water pixel when it 
contains only water, i.e., when the pixel does not also 
contain some other terrain type. Stream pixels, on the 
other hand, will generally contain some other terrain type 
or types in addition to water, i.e., they will be mixed 
pixels. It is hoped that streams, at least the ones large 
enough to be potential obstacles, will be detectable by the 
influence of their water content on the spectral response of 
the pixel. It is also possible that streams may be detect¬ 
able by their effect on the vegetation lining the stream 
banks, creating a contrast between the vegetation lining the 
stream and the other local vegetation. 

Land subject to inundation may be inundated, have a 
specific type of vegetative cover (such as reeds or swamp 
grass), be barren soil, or have mixed cover types that do 
not permit them to be identified as a separate class. The 
question of separability can only be answered by examining 
the classified images. The question of exact cover type can 
not be answered here. The value of these areas as either 
obstacles or concealment will depend on the exact cover type 
or whether the area is inundated.

Woodland and scrub are very broad categories, but a
finer distinction is not possible with the available ground 
information. The size, health, density, and type of woods 
or scrub can affect the spectral response and thus the 
classification of a given pixel, but any further subdivision 
of these classes would be speculation. More detailed infor¬ 
mation about subclasses would, of course, be important for 
making judgements about the quality of cover and concealment 
they may provide. If these land cover types are separable 
from the rest of the image, the value of using Landsat TM 
imagery for finding the correct state of these military 
terrain classes would be demonstrated. 

a. The CORINTO Site

The CORINTO site was used to attempt to identify 
water obstacles. Figure 7 is an overlay of the map of the 
CORINTO site showing the information classes of interest. 
There are significant areas of open water and mangrove.
There are numerous streams in the site, which may be
detectable as subpixel-width linear features.

Along the streams in the eastern part of the site 
are three potential water obstacles. These potential water 
obstacles were identified on the map as being wider than 
the normal stream markings (see Figure 4). Two of these 
areas are above dams ("Presa" is Spanish for dam). Other
portions of the streams could also be water obstacles, but
no other candidates are obvious from the map reference.

Figure 7. Overlay of the CORINTO Map Showing Water
Features, Mangrove, and Land Subject to Inundation

As mentioned above, the land subject to inunda¬ 
tion may or may not be identifiable as a separate class. 
There were no special markings on the rest of the site, so 
those areas were not considered in the rest of the study. 

b. The MALP Site

The MALP site was used to attempt to identify 
vegetative cover and concealment. Figure 6 is a map overlay 
of the MALP site showing the information classes of inter¬ 
est. The only information classes of interest in this site 
are woodland and scrub. These are very broad categories, so 
it was expected that more than one or two spectral classes 
would map to each of these information classes. However, if 
portions of these classes can be separated from the rest of 
the site by the classification algorithms, that would indi¬ 
cate that further study with better ground information would 
be worthwhile. 

Human activity could complicate the evaluation of 
vegetative cover and concealment in this site. This site is 
in the more populous part of Nicaragua. Human activity, even
over short periods of time, can have a significant effect on
the woodland and scrub areas through such activities as 
logging, land clearing, and firewood collection. This could
render the map reference used here out-of-date and lead to
poor results. 

3. Assigning Spectral Classes to Information Classes 

Clusters were manually assigned to information
classes. This task was made much easier by means of a few 
simple procedures and one fortunate circumstance. When the 
map overlays of Figures 6 and 7 were copied at a 74 percent 
scale factor (available on local copying machines), they 
were almost the same size as the images produced by one of 
the available printers. The difference was about one per¬ 
cent, adequate for the task given the limited ground infor¬ 
mation. By making transparencies of the reduced overlays, 
the appropriate transparency could be laid on top of the 
printed classification image. This made the cluster identi¬ 
fication process much easier. 

The maps of Figures 4 and 5 were the ground refer¬ 
ence used to identify the information classes of interest in 
this study. The map overlays of Figures 6 and 7 were the 
references used to assign spectral classes in the classified 
images to the information classes of interest. 

E. CLASSIFICATION ACCURACY ASSESSMENT 

If a remote sensing-derived land cover map is to be 
useful, there must be some method for assessing classifica¬ 
tion accuracy. [Ref. 5:p. 225] This normally requires the 
collection of information about some parts of the terrain
which can then be compared with the remote sensing-derived
classification map. This means that to assess classifica¬ 
tion accuracy it is necessary to compare two classification 
maps, the remote sensing-derived map and a reference map 
that is assumed to be accurate. The reference map may be 
derived from on site investigation or, as is often done, 
from the interpretation of remotely sensed data obtained at 
a larger scale or higher resolution. For example, research¬ 
ers often compare a Landsat-supervised classification map 
with a reference map produced by interpreting large-scale 
(e.g., 1:20,000) aerial photographs. 
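
One common way to make that comparison is a confusion matrix. The sketch below is a generic illustration with hypothetical names, not a procedure taken from this study: it tabulates reference class against assigned class for every pixel and derives the overall accuracy and the accuracy for each reference class.

```python
import numpy as np

def accuracy_report(reference, classified, n_classes):
    """Compare a classified map with a reference map, pixel by pixel.

    reference, classified : integer class labels (0 .. n_classes-1), same shape
    Returns (overall_accuracy, per_class_accuracy, confusion_matrix), where
    confusion_matrix[r, c] counts pixels of reference class r labelled c.
    """
    ref = np.asarray(reference).ravel()
    cls = np.asarray(classified).ravel()
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(confusion, (ref, cls), 1)
    overall = np.trace(confusion) / confusion.sum()
    row_totals = confusion.sum(axis=1)
    # Fraction of each reference class that was labelled correctly;
    # classes absent from the reference map are reported as NaN.
    per_class = np.divide(np.diag(confusion), row_totals,
                          out=np.full(n_classes, np.nan),
                          where=row_totals > 0)
    return overall, per_class, confusion
```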

"The overall accuracy of land-use maps for earth re¬ 
source management should generally be 85% and the accuracy 
must be approximately equal for most categories." [Ref.
5:pp. 225-226] Since the only ground reference available
for this study was the 1:50,000 scale maps, the accuracy for 
the identifiable classes may not reach that goal, and not 
all of the area in each site will be classifiable. However, 
since the goal of this study is to demonstrate the feasibil¬ 
ity of using Landsat satellite imagery for terrain analysis 
and not to manage earth resources, a lower level of accuracy 
is acceptable.

IV. ANALYSIS OF RESULTS


A. PRESENTATION OF CLASSIFICATION RESULTS 

The results of the unsupervised classifications are best 
presented in image form. The different clusters in the 
classified image are assigned different colors or gray scale 
values to depict the spatial relationship of the various 
classes. Generally, spectral classes that do not contain 
information of interest are all assigned the same gray scale 
value. Here, a value of 255 (white) was used for the classes
not shown in a classified image. These classes were
usually the "unknown" parts of the site. The spectral
classes corresponding to the information classes of interest 
are the only colored or gray areas in the classified image. 

All of the spectral classes corresponding to one information
class can be assigned the same gray scale value.
This would be done to create a classified image containing
only the information classes of interest. This was not done 
here. Each spectral class was assigned a different gray 
scale value so that the spectral structure of the image data 
could be clearly seen. 

Some comments accompany each classified image to assist 
the reader in interpreting the results of the classification
and to call attention to points of interest in the
classification result.

B. STATISTICAL SEPARABILITY OF CLUSTERS 

It is relatively easy to run a large number of classifi¬ 
cations. All that is required are computer time and pa¬ 
tience. It is much more difficult and time consuming to 
analyze all of the results. One method of selecting the 
results for further analysis is to use a statistical measure 
of class separability. 

Since class separability is a function not only of the 
distance between class means but also of the class probabil¬ 
ity distributions [Ref. 28:p. 335], a measure that includes 
both factors is needed. A measure called the divergence 
meets these criteria. A separability measure derived from 
the divergence, called the transformed divergence, may also
provide an indirect method of estimating the likelihood of 
correct classification [Ref. 29:p. 689]. 

The divergence is calculated from the mean and covari¬ 
ance of each spectral class and is a measure of the statis¬ 
tical distance between class pairs [Ref. 29:p. 688]. The 
divergence is derived from the logarithmic-likelihood ratio 
[Ref. 13:pp. 167-168]. The pairwise divergence between
classes i and j is defined as [Ref. 29:p. 688]:

$$D_{ij} = \int \left[ p(x|i) - p(x|j) \right] \ln \frac{p(x|i)}{p(x|j)} \, dx \qquad (1)$$

where p(x|i) is the probability density function of x for
class i.

When the classes are assumed to have normal probability 
functions, the expression for divergence simplifies to [Ref. 
13:p. 168]: 

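$$D_{ij} = 0.5\,\mathrm{tr}\left[ (C_i - C_j)(C_j^{-1} - C_i^{-1}) \right] + 0.5\,\mathrm{tr}\left[ (C_i^{-1} + C_j^{-1})(\mu_i - \mu_j)(\mu_i - \mu_j)^T \right] \qquad (2)$$

where $C_i$ is the covariance matrix for class i, $\mu_i$ is the
mean vector for class i, and tr(A) denotes the trace, or sum
of the diagonal elements, of the matrix A.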

A problem with divergence is that, as two classes are 
more widely separated in feature space, the probability of 
correct classification has an upper bound of 100 percent, 
but the divergence will continue to increase [Ref. 28:p.
340]. One solution is to use the transformed divergence, a
saturating function of the divergence.

The transformed divergence is defined as [Ref. 20:p.
DIVERGE-2]:

$$D^T_{ij} = 100\left(1 - e^{-D_{ij}/8}\right) \qquad (3)$$

where $D^T_{ij}$ is the transformed divergence between classes i
and j. The transformed divergence is extended to cover all
of the class pairs by calculating the average transformed
divergence $D^T_{AVE}$, which is simply the numerical average of
the transformed divergence $D^T_{ij}$ over all class pairs [Ref.
20:pp. DIVERGE-2 to DIVERGE-3].
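
The quantities above can be computed directly from class means and covariance matrices. The sketch below is an illustration under stated assumptions, not the LAS DIVERGE routine: it uses the normal-distribution form of the divergence, a transformed divergence that saturates at 100, and the customary constant 8 in the exponent. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def divergence(mean_i, cov_i, mean_j, cov_j):
    """Pairwise divergence between two normally distributed classes."""
    mean_i, mean_j = np.asarray(mean_i, float), np.asarray(mean_j, float)
    cov_i, cov_j = np.asarray(cov_i, float), np.asarray(cov_j, float)
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    d = (mean_i - mean_j).reshape(-1, 1)            # column vector
    term_cov = 0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i))
    term_mean = 0.5 * np.trace((inv_i + inv_j) @ (d @ d.T))
    return term_cov + term_mean

def transformed_divergence(d_ij):
    """Saturating transform of the divergence, bounded above by 100.

    The divisor 8 is the customary constant and is an assumption here.
    """
    return 100.0 * (1.0 - np.exp(-d_ij / 8.0))

def average_and_minimum_td(means, covs):
    """Average and minimum transformed divergence over all class pairs."""
    values = [transformed_divergence(
                  divergence(means[i], covs[i], means[j], covs[j]))
              for i, j in combinations(range(len(means)), 2)]
    return float(np.mean(values)), float(np.min(values))
```

Two classes with identical statistics give a divergence of zero, and widely separated classes approach the ceiling of 100.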

One method of using the transformed divergence is to 
select the feature set having the greatest average trans¬ 
formed divergence [Ref. 13:p. 169]. This is similar to 
maximizing the probability of correct classification. 

Another method is to select the feature set having the 
largest minimum value of transformed divergence [Ref. 13:p. 
173]. This would be the feature set with the best perfor¬ 
mance in separating the most difficult pair of classes to 
separate. 
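
The two selection rules need not agree, as the short sketch below shows. The feature set labels and the (average, minimum) values are hypothetical, chosen only to show the two criteria picking different sets.

```python
# (average, minimum) transformed divergence per candidate feature set.
# All names and numbers here are hypothetical illustration values.
candidates = {
    "bands 4-5-7": (95.0, 34.0),
    "bands 1-4-5": (96.5, 12.0),
    "ratios":      (93.0, 60.0),
}

# Rule 1: greatest average transformed divergence.
best_by_average = max(candidates, key=lambda name: candidates[name][0])

# Rule 2: largest minimum transformed divergence, i.e. the best
# performance on the hardest pair of classes to separate.
best_by_minimum = max(candidates, key=lambda name: candidates[name][1])

print(best_by_average, "|", best_by_minimum)
```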

According to Jensen [Ref. 5:p. 201], a transformed 
divergence value of 100 suggests excellent class separation; 
a transformed divergence above 95 suggests good separation; 
and a value below 85 suggests poor class separation. Haack 
[Ref. 30:p. 269], on the other hand, states that a trans¬ 
formed divergence value of 75 or greater generally indicates 
an acceptable separability of classes. An exact threshold 
value for "acceptable" separability is not as important here 
as a feel for what the value of the transformed divergence 
means, i.e., a larger value is better, and a value below the 
range of 75 to 85 means that the two clusters are not well 
separated by this measure.

C. SUMMARY OF CLASSIFICATION RESULTS

A summary of selected classifications for the CORINTO
site is presented in Table 11. A summary for the MALP site
is in Table 12. Both summaries include the classified image
name, the number of clusters, and the average and minimum
transformed divergence. The number of iterations required
for each iterative classification algorithm is also listed.
For some of the classifications, the time required for the
classification (in hours and minutes on a MicroVax II) is
included. This gives a relative measure of the speed of
each algorithm.

Table 11. SUMMARY OF SELECTED CLASSIFICATIONS FOR THE
CORINTO SITE

Classified                Number of   D^T_AVE   D^T_MIN   Execution   Number of
image name                clusters                        time        iterations

CORINTO.CLASS1               29        97.94    33.[?]3                   60
CORINTO.CLASS1.MINDIST       29        97.99    40.09
CORINTO.CLASS2               48                            42:13          80
CORINTO.CLASS2.MINDIST       48
CORINTO.KMEANS1              23        97.43    41.95      12:11          20
CORPCA.CLASS1                10        86.43     2.34       4:09          20
CORPCA.KMEANS3               20        85.35     5.00       4:11          14
COR457.KMEANS1               23        94.86    34.40                     26
COR145.KMEANS2               23        86.84     3.13       8:34          25
COR145.CLASS1                13        84.93     5.50       9:57          40
COR456.KMEANS1               23        86.59     6.92                     25
CORTC.KMEANS1                23        91.50     5.00                     26
RATIO.KMEANS1                23        98.00    69.25       6:07          18
RATIO.KMEANS8                 8        98.57    82.22                      9
RATIO12.KMEANS8               8        95.86    65.73       0:55           9
RATIO13.KMEANS8               8        76.20     5.28
RATIO23.KMEANS8               8        70.20     4.47                      9
COR147.KMEANS1               23        84.58     1.56                     21
CORINTO.HINDU3               23        98.48    58.90       0:05
CORINTO.HINDU3.MINDIST       23        97.56    56.09       0:55

Table 12. SUMMARY OF SELECTED CLASSIFICATIONS FOR THE
MALP SITE

Classified                Number of   D^T_AVE       D^T_MIN   Execution   Number of
image name                clusters                            time        iterations

MALP.CLASS4                  17       [illegible]    61.18                    60
MALP.CLASS4.MINDIST          17        98.28         72.00
MALP.KMEANS2                 23        97.96         17.35     17:39          29
MALP.KMEANS8                  8        99.34         94.57                    19
MALPPCA.CLASS2               11        89.23         31.49                    40
MALPPCA.KMEANS1              23        91.26          5.90                    28
MALPPCA.KMEANS8               8        90.41         43.28                    25
MALP457.KMEANS1              23        90.08         21.90                    33
MALP457.KMEANS8               8        88.15         39.15                    39
MALP145.CLASS1               11        86.46         14.05      4:52          40
MALP145.KMEANS1              23        90.00         11.84      9:51          33
MALP.KMEANS8                  8        89.88         33.30                    19
MALP456.KMEANS1              23        80.23          1.83                    38
MALP456.KMEANS8               8        80.23         14.78                    23
MALP345.KMEANS8               8        90.22         40.69                    19
MALPTC.KMEANS1               23        74.94          3.16                    24
MALP346.KMEANS8               8        85.75          5.35                    16
MALP147.KMEANS1              23        86.41          4.29                    25
MALP.HINDU2.MINDIST          23        96.82         30.49
MALP.HINDU3.MINDIST          13        98.44         81.94

Most of the classified images have many more clusters 
than there were information classes of interest. This was 
done because the spectral structure of the data was unknown. 
With more spectral classes than information classes, the 
additional spectral classes are either subclasses of the 
information classes of interest and can be combined, or they
can be ignored as classes not of interest. Classifying the
images into too few spectral classes can result in spectral 
classes that are mixtures of information classes. Mixed 
spectral classes are of little or no use to this study. 

1. Classified Image Naming Conventions 

The naming convention used in Tables 11 and 12
provides information about the feature set and classification
algorithm for each classified image. The classified image 
name is composed of two or three parts: the first part is 
the feature set identifier, the second is the classification 
algorithm identifier, and the third part, if there is one, 
indicates that the MINDIST algorithm was used as a post¬ 
processing step. 

The feature set identifier part of the classified 
image name is constructed as follows: 


• CORINTO or MALP indicates that all seven of the original 
thematic mapper bands were used. 

• CORxyz or MALPxyz indicates that the three bands x, y, 
and z of the original seven thematic mapper bands were 
used. 

• CORPCA or MALPPCA indicates that the first three bands 
of the principal component transformation were used. 

• CORTC or MALPTC indicates that the three tasseled cap 
transformation components (greenness, brightness, and 
wetness) were used. 

• RATIO indicates that the three ratio bands, (band 1)/ 
(band 5), (band 2)/(band 5), and (band 3)/(band 4), were 
used.

• RATIOxy indicates that two of the three ratio bands were
used, where a 1 indicates that the ratio (band 1)/(band
5) was included, a 2 indicates that the ratio (band
2)/(band 5) was included, and a 3 indicates that the
ratio (band 3)/(band 4) was included.


The classification algorithm identifier was
constructed as follows:


• CLASS indicates that the ISOCLASS algorithm was used 

• KMEANS indicates that the KMEANS algorithm was used 

• HINDU indicates that the HINDU algorithm was used 

The number following the algorithm identifier is a reference 
number used to keep track of different results from using 
the same algorithm on the same feature set. 

The presence of MINDIST in the classified image name
indicates that the MINDIST algorithm was used on the image
as a post-processing step to reclassify the image using the
Euclidean distance rule. KMEANS already uses that distance
rule, so MINDIST was not used on any of the KMEANS
classifications.
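
The naming convention is regular enough to be decoded mechanically. The sketch below is a hypothetical helper, not part of LAS or of the programs in Appendix C: it splits a classified image name into feature set, algorithm, run number, and whether MINDIST post-processing was applied.

```python
import re

# Algorithm identifiers as they appear in image names, mapped to routines.
ALGORITHMS = {"CLASS": "ISOCLASS", "KMEANS": "KMEANS", "HINDU": "HINDU"}

NAME_PATTERN = re.compile(
    r"^(?P<features>[A-Z]+[0-9]*)"          # feature set, e.g. COR457, RATIO12
    r"\.(?P<algorithm>CLASS|KMEANS|HINDU)"  # classification algorithm identifier
    r"(?P<run>[0-9]+)"                      # reference number of the run
    r"(?P<mindist>\.MINDIST)?$"             # optional post-processing marker
)

def parse_image_name(name):
    """Split a classified image name into its documented parts."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized classified image name: {name!r}")
    return {
        "feature_set": match["features"],
        "algorithm": ALGORITHMS[match["algorithm"]],
        "run": int(match["run"]),
        "mindist": match["mindist"] is not None,
    }

print(parse_image_name("CORINTO.HINDU3.MINDIST"))
```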

2. Input Parameters for ISOCLASS

The ISOCLASS routine requires a number of input 
parameters. The parameters used are shown in Table 13. The 
meaning of these parameters is listed below [Ref. 20:pp. 
ISOCLASS-1 to ISOCLASS-2]: 

• Any two clusters whose means are closer than DLMIN are 
combined.

• NMIN is the minimum number of members desired in any
cluster. Clusters that have fewer than NMIN members are
deleted.

• Any cluster whose standard deviation is greater than 
STDMAX and has more than 2(NMIN + 1) members is split. 

• CHNTHS is the threshold for chaining clusters. 

• MAXCLS is the maximum number of clusters. 

Recommended ranges on the values of DLMIN and CHNTHS are 
given in the LAS User's Manual [Ref. 20:pp. ISOCLASS-1 to
ISOCLASS-5]. The value of STDMAX was selected by a trial-
and-error process based on the number of clusters produced 
by different values of this parameter. As long as the value 
of NMIN was small, it did not have much effect on the clas¬ 
sification results. 

The maximum allowable value of 64 for MAXCLS was 
used. This was done because the spectral structure of the 
data was unknown. With more spectral classes than informa¬ 
tion classes, the additional spectral classes are either 
subclasses of the information classes of interest and can be 
combined, or they can be ignored as classes not of inter¬ 
est. Classifying the images into too few spectral classes 
can result in spectral classes that are mixtures of informa¬ 
tion classes. Mixed spectral classes are of little or no 
use to this study. 

A more detailed description of the ISOCLASS algorithm
is given in Appendix D.

Table 13. INPUT PARAMETERS FOR THE ISOCLASS UNSUPERVISED
CLASSIFICATION ALGORITHM

Image name          DLMIN    STDMAX    NMIN    CHNTHS

MALP.CLASS4          3.9      10.0      300      3.9
MALPPCA.CLASS2       3.2       4.5      150      3.2
MALP145.CLASS1       3.9      10.5      300      3.9
CORINTO.CLASS1       5.0      10.5      300      5.0
CORINTO.CLASS2       5.0      10.5       20      5.0
CORPCA.CLASS1        3.2       4.5       30      3.2
COR145.CLASS1        3.9      11.0      200      3.9


3. General Observations 

From Tables 11 and 12, it can be seen that the
classified images with the larger minimum values of
transformed divergence tend to have the larger average
values of transformed divergence. Since larger minimum
values indicate a better ability to separate hard-to-separate
classes, the minimum value was used to select classified
images for analysis. The band sets with the largest values
of both the minimum and the average transformed divergence
are the original seven-band set, the (4 5 7) band set, the
ratio images that include both the (band 1)/(band 5) and
the (band 2)/(band 5) ratios, and many of the MALP site
classifications with eight classes.

As can be seen in Tables 11 and 12, processing time
increased both as the number of bands in the input image
increased and as the number of output clusters increased.

The processing time required by the various clustering
algorithms for classifying the same input band set with
similar numbers of clusters varied widely. All of the
seven-band-set results for the CORINTO site had fairly large
values of transformed divergence with all of the clustering
algorithm combinations used in the study, which permits a
comparison of the clustering algorithms. An evaluation of
the ability of this band set to separate the classes of
interest for this site is also possible.

D. COMPARISON OF CLUSTERING ALGORITHMS 
1. The HINDU Algorithm 

The HINDU algorithm runs very quickly, taking about
five minutes to classify a seven-band, 512 x 512 pixel
image. The one classification listed in Table 11 using this
algorithm, called CORINTO.HINDU3, had a relatively high
minimum value of transformed divergence (58.90). However,
the spectral classes in this classified image do not match
the information classes of interest very well, based on
manual comparison with Figure 7.

Figure 8 shows the classes of interest extracted
from the CORINTO.HINDU3 classified image. Comparing Figure
8 with the map overlay of Figure 7, it can be seen that this
classification did detect one class of water and two of
"mangrove." ("Vegetation and water" might be a more
accurate description of the mangrove class. Other, clearly
non-mangrove, areas were classified as belonging to this
class, and it is not clear that mangrove can be spectrally
separated from the other vegetation present.) It also
detected the three potential water obstacles on the streams
in the eastern part of the CORINTO site. However, the water
and mangrove do not have clear boundaries like those shown
in the map overlay, but are mixed together in the northern
part of the mangrove. There also seems to be a small amount
of misclassification in the rest of the image.

Figure 8. The CORINTO.HINDU3 Classified Image

The minimum value of the pairwise transformed
divergence between the three classes was 96.3. Yool et al.
found no clear relationship between divergence values and
classification accuracy for individual classes, possibly
because the assumption of Gaussian class distributions is
not always accurate [Ref. 28:p. 689]. However, the
transformed divergence will still be used here as a measure
of the statistical separability of classes. It is the best
measure of statistical separability available in the Land
Analysis System.
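
As an illustration of the separability measure discussed
above, the sketch below computes a transformed divergence
between two Gaussian classes. It is a simplified example
under stated assumptions, not the LAS routine: it assumes
diagonal covariance matrices (per-band variances only), uses
the commonly cited saturating form scale * (1 - exp(-D/8)),
and assumes a scale of 100 because the values reported in
this study approach 99.99. The function names are
hypothetical.

```python
import math


def divergence_diag(mean_i, var_i, mean_j, var_j):
    """Divergence between two Gaussian classes with diagonal covariances."""
    total = 0.0
    for mi, vi, mj, vj in zip(mean_i, var_i, mean_j, var_j):
        # Covariance term: 0.5 * (vi - vj) * (1/vj - 1/vi), always >= 0.
        total += 0.5 * (vi - vj) * (1.0 / vj - 1.0 / vi)
        # Mean term: 0.5 * (1/vi + 1/vj) * (mi - mj)^2.
        total += 0.5 * (1.0 / vi + 1.0 / vj) * (mi - mj) ** 2
    return total


def transformed_divergence(mean_i, var_i, mean_j, var_j, scale=100.0):
    """Saturating transform of the divergence; scale=100 is an assumption."""
    d = divergence_diag(mean_i, var_i, mean_j, var_j)
    return scale * (1.0 - math.exp(-d / 8.0))
```

The exponential saturates quickly, so well-separated class
pairs all report values close to the scale maximum, while
poorly separated pairs stand out with much lower values.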

2. HINDU Followed by MINDIST

As shown in Figure 9, using the MINDIST algorithm on
the CORINTO.HINDU3 classification result still yields one
class of water and two of mangrove, but the separation of
classes appears to be much more accurate. Water was better
separated from the mangrove, more of the streams in the area
were detected, and there were fewer pixels in these classes
in the "unknown" parts of the site. The minimum value of
the pairwise transformed divergence between the three
classes was 91.7.

Figure 9. The CORINTO.HINDU3.MINDIST Classified Image

It takes MINDIST about 50 minutes to reclassify a
seven-band image using the cluster centers calculated by
a different classification algorithm. MINDIST uses the
input cluster centers and the selected distance measure to
reassign all of the pixels in the image to the various
classes.
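
The reassignment step described above can be sketched as a
nearest-center rule. This is an illustrative example, not
the LAS MINDIST code: the function name is hypothetical, and
Euclidean distance is assumed here as the "selected distance
measure."

```python
import math


def mindist_classify(pixels, centers):
    """Assign each pixel to the index of its nearest cluster center."""
    labels = []
    for p in pixels:
        # Assumption: Euclidean distance in band space.
        distances = [math.dist(p, c) for c in centers]
        labels.append(distances.index(min(distances)))
    return labels


# Example with two-band pixels and two input cluster centers.
example_labels = mindist_classify(
    [(1, 1), (9, 9), (4, 4)], [(0, 0), (10, 10)]
)
```

Since every pixel is compared against every center exactly
once, the cost grows with the number of pixels, bands, and
clusters, but no iteration is required.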

3. The KMEANS Algorithm

The KMEANS algorithm runs at a moderate speed,
normally four to eight hours for a three-band image and six
to 12 hours for a seven-band image. Run times are somewhat
higher for a greater number of clusters and for a smaller
value of the execution cutoff threshold, PCTCNG. Control
over the number of clusters is very good, and the accuracy
appears to be good also. However, the clusters may not all
be significantly different if the number of clusters
specified by the analyst exceeds the number of clusters
actually occurring in the image.
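
The role of the cutoff threshold can be illustrated with a
small k-means loop that stops once the percentage of pixels
changing cluster in an iteration drops to the threshold. This
is a sketch under stated assumptions, not the LAS KMEANS
program: the function name, the Euclidean distance, the
max_iter safeguard, and the use of "less than or equal to"
for the stopping test are all choices made for this example.

```python
import math


def kmeans_pctcng(pixels, centers, pctcng=1.0, max_iter=100):
    """Iterate k-means until at most pctcng percent of pixels change cluster."""
    centers = [tuple(c) for c in centers]
    labels = [-1] * len(pixels)
    for _ in range(max_iter):
        changed = 0
        for n, p in enumerate(pixels):
            dists = [math.dist(p, c) for c in centers]
            best = dists.index(min(dists))
            if best != labels[n]:
                changed += 1
                labels[n] = best
        # Recompute each center as the mean of its members;
        # a cluster with no members keeps its previous center.
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(v) / len(members) for v in zip(*members))
        if 100.0 * changed / len(pixels) <= pctcng:
            break
    return labels, centers
```

A smaller threshold forces more passes over the image before
the loop stops, which is consistent with the longer run
times noted above.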

The classification results of CORINTO.KMEANS1 are
similar to the CORINTO.HINDU3.MINDIST results, but two water
classes were detected instead of one, and three mangrove
classes were detected instead of two (see Figure 10). These
additional spectral classes appear to be transition classes:
one between water and mangrove, and the other between
mangrove and the rest of the image. The information contained
in these additional spectral classes might distinguish
between terrain of significantly different obstacle value.
Without better ground information, no definite conclusion
can be made.

The minimum value of the pairwise transformed
divergence between the five classes was 41.9 between the two
mangrove classes. Between the remaining class pairs the
minimum value was 94.0. This indicates that, according to
the transformed divergence measure, the two mangrove classes
are not well separated spectrally, but that the remaining
class pairs are spectrally well separated.
Figure 10. The CORINTO.KMEANS1 Classified Image

It took 12 hours and 11 minutes for this classification
to reach the termination threshold of one percent of the
pixels changing clusters in an iteration.

4. The ISOCLASS Algorithm


ISOCLASS generally takes longer to produce results
than the other two unsupervised classification algorithms.
ISOCLASS gives the user the most control over cluster
statistics, but there is no direct way to estimate the number
of clusters which a given set of input parameters is likely
to produce. Each new run of the algorithm is, in part, a
trial to see if the resulting number of clusters is in the
desired range. Experience helps in the selection of input
values, but this still only provides a starting point for a
trial-and-error process. Of course, this process also gives
the researcher some insight into the spectral structure of
the natural clusters present in the image. There is also no
completely unambiguous sign of convergence, though there are
normally some fairly strong indications.

As seen in Figure 11, the CORINTO.CLASS1
classification result is almost the same as the
CORINTO.KMEANS1 classification result, except for the
assignment of gray scale values to classes. The same five
classes were detected, with the same meanings and most of
the same member pixels. There are fewer stream and "unknown"
pixels in CORINTO.CLASS1 than in CORINTO.KMEANS1, especially
in the northern part of the site.

The minimum value of the pairwise transformed
divergence between the five classes was 44.4 between the two
mangrove classes. Between the remaining class pairs the
minimum value was 95.7.

Figure 11. The CORINTO.CLASS1 Classified Image

The run time of 27 hours and 52 minutes for
CORINTO.CLASS1 does not include the time required by several
previous attempts. These previous attempts were necessary to
find appropriate values for the input parameters and to gain
experience in better estimating "good" input parameters for
the ISOCLASS algorithm.

The CORINTO.CLASS2 classification was a continuation
of CORINTO.CLASS1 with a smaller value for the minimum
number of pixels allowed in a cluster. The change was from
300 to 20. Although the number of clusters increased from
29 to 48, most of the additional clusters were small.
Twelve of the additional 19 clusters had fewer than 300
pixels. Of interest to this study is that the "water
boundary" class was split into two classes. One class
appears to be the same as described above, a transition from
water to mangrove. The other seems to be a transition from
water to classes other than mangrove. The water obstacles
were still classified as mangrove and water boundary. The
other water transition class mainly consisted of coastal
pixels that were not near any mangrove.

5. ISOCLASS Followed by MINDIST 

Using the MINDIST algorithm on an image that has
been classified using ISOCLASS slightly increases the
average and minimum values of the transformed divergence,
but the classification results for the classes of interest
do not appear to be much different. As shown in Figure 12,
the CORINTO.CLASS1.MINDIST image has slightly fewer stream
and unknown pixels than the CORINTO.CLASS1 image, but there
do not appear to be any other differences.

The minimum value of the pairwise transformed
divergence for the five classes was 44.0 between the two
mangrove classes. Between the remaining class pairs the
minimum

Figure 12. The CORINTO.CLASS1.MINDIST Classified Image

E. CORINTO SITE CLASSIFICATION RESULTS 

1. Comments on Class Names 

Most of the spectral clusters or classes fall into
one of two general categories: water or mangrove. These two
categories have one or more sub-categories in the various
classification images. Some of these sub-categories appear
to be the result of spectral differences in a single
information class, while others seem to be transitions
between information classes, since they predominantly occur
at the boundaries between the information classes. The
transition class between water and mangrove has been called
"water boundary," and the transition class between mangrove
and the rest of the image has been called "land boundary."

There was, in most images, no separate "stream"
class. Pixels for the potential water obstacles were
classified as being water, mangrove, or one of the
transition classes. Other portions of streams detected were
usually classified as mangrove or the land boundary class.
"Stream pixel" has been used as a descriptive term
identifying pixels along streams and normally does not
identify a separate spectral class.

2. Use of the KMEANS Algorithm 

The KMEANS algorithm was used for most of the
classifications made during this study. From the comparison
of clustering algorithms above, KMEANS was more accurate than
HINDU, even when the HINDU results were post-processed using
MINDIST. It was also faster than ISOCLASS with about the
same accuracy.

3. The COR457.KMEANS1 Classification Results

The COR457.KMEANS1 classification, shown in Figure
13, had a fairly high minimum value of transformed
divergence, 34.40. This classification resulted in five
spectral classes for the information classes of interest. In
addition to one water and two mangrove classes, there was one
class that appeared to be a mixed class containing both
mangrove and the entire water boundary class. The other
class was a mixed class of land boundary and streams, with
streams making up about half of this class. This
classification detected by far the most streams, and was the
only classification with streams making up such a large
percentage of the total number of pixels in any one class.

The three potential water obstacles were assigned to 
the water boundary/mangrove and the northernmost (black) 
mangrove classes. 

The minimum value of the pairwise transformed
divergence for the five classes was 69.8 between the two
mangrove classes. Between the remaining class pairs the
minimum value was 95.7.

4. The RATIO.KMEANS1 Classification Results

Figure 13. The COR457.KMEANS1 Classified Image

The RATIO.KMEANS1 classification has a very high
minimum value for the transformed divergence, 69.25. This
classification, shown in Figure 14, resulted in eight
spectral classes for the information classes of interest.
This was the greatest number of spectral classes for any
classification of the CORINTO site. There was one water
class, three water boundary classes, three mangrove classes,
and one land boundary class.

The three potential water obstacles had pixels that
were assigned to two of the water boundary classes and to
all three of the mangrove classes.

Figure 14. The RATIO.KMEANS1 Classified Image

For some reason, the streams detected in most of the
other classifications were not detected here. Conversely,
the stream in the northeast (upper right) part of the site
was detected here but does not appear in most of the other
classifications.

The minimum value of the pairwise transformed
divergence for the eight classes was 71.1 between the two
larger mangrove classes. Between the remaining class pairs,
the minimum value was 91.1.

5. The RATIO.KMEANS8 Classification Results 

To test if the information classes identified in the
above classifications were spectrally homogeneous enough to
classify the broad information classes with fewer spectral
classes, the RATIO band set was classified with the KMEANS
algorithm into eight spectral classes. The resulting
classified image, shown in Figure 15, had three spectral
classes of interest: one water class, one mangrove class,
and one water boundary class.

The potential water obstacles were classified into 
the water boundary and mangrove classes. 

The main difference between this classification
result and the previous one, other than the different number
of classes, is that here almost no stream pixels were
detected.

Figure 15. The RATIO.KMEANS8 Classified Image

The minimum value of the pairwise transformed
divergence between the three classes was 99.99, indicating
excellent spectral separability.

6. The RATIO12.KMEANS8 Classification Results

To test the possible utility of a two-band-ratio set
for detecting water obstacles, the three two-band
combinations of the available ratios were classified using
the KMEANS algorithm. The two-band set containing the
(band 1)/(band 5) and (band 2)/(band 5) ratios, shown in
Figure 16, was the only one with a large minimum value of
transformed divergence. Again, there are three spectral
classes of interest: one water class, one mangrove class,
and one water boundary class.
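
The two-band ratio set described above can be illustrated
with a short sketch that builds the (band 1)/(band 5) and
(band 2)/(band 5) features for each pixel. This is only an
example of the idea: the function names are hypothetical,
bands are represented as flat lists of pixel values, and the
handling of zero denominators (a fill value of 0.0) is an
assumption. Any rescaling of the ratios that the original
processing may have applied is not shown.

```python
def band_ratio(numerator, denominator, fill=0.0):
    """Per-pixel ratio of two bands given as equal-length lists of values."""
    if len(numerator) != len(denominator):
        raise ValueError("bands must have the same number of pixels")
    # Assumption: pixels with a zero denominator receive the fill value.
    return [n / d if d != 0 else fill for n, d in zip(numerator, denominator)]


def two_band_ratio_set(band1, band2, band5):
    """Build the (band 1)/(band 5), (band 2)/(band 5) feature pair per pixel."""
    r15 = band_ratio(band1, band5)
    r25 = band_ratio(band2, band5)
    return list(zip(r15, r25))
```

Each pixel is then described by two numbers instead of seven,
which is the reduction in dimensionality that this
classification was intended to test.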

The potential water obstacles were classified into 
the water boundary and mangrove classes. There were also 
more pixels of unknown information classes assigned to one 
of the three spectral classes. 

The minimum value of the pairwise transformed
divergence between the three classes was 99.99, indicating
excellent spectral separability.

7. The CORPCA.KMEANS3 Classification Results

Since the desired number of classes is an input to
KMEANS, it is possible that the number of clusters created
is greater than the number of spectrally distinct classes.
This could result in a low minimum value of the transformed
divergence. With this possibility in mind, the principal
component transformation classification results and the
tasseled cap transformation classification results were
examined.

Figure 16. The RATIO12.KMEANS8 Classified Image

The CORPCA.KMEANS3 classification, shown in Figure
17, had a low minimum value of transformed divergence, 5.00.
This classification resulted in four spectral classes of
interest: one water class, two mangrove classes, and one
land boundary class.

The classified image is very similar to the
CORINTO.KMEANS1 and the CORINTO.CLASS1 results.

Figure 17. The CORPCA.KMEANS3 Classified Image

The three potential water obstacles were assigned 
primarily to the two mangrove classes, though a few pixels 
were assigned to the water class. 

The minimum value of the pairwise transformed
divergence for the four classes was 60.2 between the two
mangrove classes. For the remaining class pairs, the minimum
value was 67.3 between the lighter (more southerly) mangrove
class and the land boundary class.

8. The CORTC.KMEANS1 Classification Results

The CORTC.KMEANS1 classification had a low minimum
value of transformed divergence, 5.00. This classification,
shown in Figure 18, resulted in six spectral classes for
the information classes of interest: one water class, one
water boundary class, three mangrove classes, and one land
boundary class.

The three potential water obstacles were assigned 
primarily to the water boundary and two of the mangrove 
classes. 

The minimum value of the pairwise transformed
divergence between the six classes was 52.9. Several of the
other class pairs had values of transformed divergence below
75.0. In spite of the low values of transformed divergence,
these results are similar to the other classifications
examined here.

9. Summary of CORINTO Site Classification Results 

Figure 18. The CORTC.KMEANS1 Classified Image

All of the algorithm and band combinations examined
were able to detect the three potential water obstacles
identified on the map of the CORINTO site. Most of the band
and algorithm combinations used detected more than one
mangrove class and a transition or boundary class between
water and mangrove and between mangrove and the rest of the
site. Because of the limited ground information available,
it is not known if these additional spectral classes
correspond to terrain of significantly different obstacle
value.

All of the band subsets and band transformations
detected the potential water obstacles, so a reduction in
the dimensionality of the classification problem is
possible. Portions of the water obstacles were classified in
the water boundary class and, in a few classifications, in
the water class. It is likely that there was enough open
water in these areas to affect the sensor readings.

Most of the classifications also detected portions
of the streams in the study site, with the (4 5 7) band
combination performing best at stream detection. Since most
of the streams (other than the water obstacles) were
classified in the mangrove or the land boundary classes, it
is likely that this classification was due to the effect of
the stream's water on the sensor response.

The minimum value of the transformed divergence does
not appear to be a valid criterion for selecting band sets
here. Since most of the spectral classes in the classified
images were not used, separation of the hardest-to-separate
spectral class may be of no consequence to the analysis. Two
classified images were examined that had low minimum values
of transformed divergence (CORPCA.KMEANS3 and
CORTC.KMEANS1). However, the lowest value of transformed
divergence between spectral classes of interest in these
images was greater than in many of the classified images
with much greater minimum values of transformed divergence.
This is shown in Table 14, which lists both the minimum
value of transformed divergence for the entire image and the
minimum value of transformed divergence for the information
classes of interest.
Table 14. MINIMUM VALUES OF TRANSFORMED DIVERGENCE (D_MIN)

Classified                   D_MIN       D_MIN between
image name                   overall     classes of interest

CORINTO.CLASS1               35.[illegible]  44.4
CORINTO.CLASS1.MINDIST       40.09       44.0
CORINTO.KMEANS1              41.95       41.95
CORPCA.KMEANS3                5.00       60.2
COR457.KMEANS1               34.40       69.8
CORTC.KMEANS1                 5.00       52.9
RATIO.KMEANS1                69.25       71.1
RATIO.KMEANS8                82.22       99.99
RATIO12.KMEANS8              65.73       99.99
CORINTO.HINDU3               58.90       96.3
CORINTO.HINDU3.MINDIST       56.09       91.7

F. MALP SITE CLASSIFICATION RESULTS 

The results for the MALP site were mixed. Since the 
results for all of the classifications were similar, only a 
few will be examined here. 

1. The MALP.KMEANS8 Classification Results 

The MALP.KMEANS8 classification had the greatest
values of both the minimum and average transformed
divergence of all of the classifications in Table 12, 94.57
and 99.34, respectively.

Figure 19 shows all eight classes of the
MALP.KMEANS8 classified image. Comparing Figure 19 to
Figure 6, the map overlay of the MALP site, one finds that
the woodland area in the center of the site and extending to
the northeast (upper right) is well defined, as are portions
of the woodland in the southern part of the site. The rest
of the vegetation classes of interest are confused with the
rest of the site. This confusion of classes holds for all
of the other classification results for this site. Woodland
and scrub were generally not distinguishable as distinct
classes, either.

Figure 19. The MALP.KMEANS8 Classified Image

Figure 20 shows the two classes that make up most of
the central woodland. The darker class also makes up a
portion of the woodland in the southern part of the site, in
addition to some of the scrub in that part of the site. The
lighter class contains some areas to the southwest of the
central woodland as well as much of the central woodland.
To some extent, this occurred in all of the classification
results for this site. From the ground information
available, there is no clear explanation for this.

Figure 20. Woodland/Scrub Classes in the MALP.KMEANS8
Classified Image

2. The MALP.KMEANS2 Classification Results

Figure 21 shows the six classes that predominantly 
fell within the boundaries of woodland or scrub for the 
MALP.KMEANS2 classification. These two information classes 
were not spectrally distinguishable in this classification, 
so they were treated as one category. 

In Figure 21, it can be seen that the central
woodland area is again well-distinguished, as are the
woodland and scrub in the southern part of the site. More of
the scrub in the northwest part of the site was included in
these spectral classes than in MALP.KMEANS8. There is still
a significant amount of misclassification in these spectral
classes, based on a manual comparison with the 1:50,000
scale maps.

Figure 21. Woodland/Scrub Classes in the MALP.KMEANS2
Classified Image

Figure 22 shows the five mixed spectral classes,
i.e., those that contained large portions of both the
woodland/scrub and the unknown parts of the site. The same
classes are also consistently mixed in the other
classifications.
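
The notion of a "mixed" spectral class can be made concrete
with a simple tally of how much of each spectral class falls
inside the mapped woodland/scrub area. The study made this
judgment by manual comparison with the maps; the sketch below
is only one possible way to express it, and the function
name and the 25 and 75 percent limits are hypothetical
choices.

```python
from collections import Counter


def mixed_classes(labels, in_reference, low=0.25, high=0.75):
    """Return spectral classes whose pixels fall only partly inside the reference area.

    labels:       spectral class label for each pixel
    in_reference: True where the pixel lies inside the mapped woodland/scrub
    low, high:    hypothetical limits on the inside fraction for "mixed"
    """
    total = Counter(labels)
    inside = Counter(lab for lab, ref in zip(labels, in_reference) if ref)
    mixed = []
    for lab in sorted(total):
        fraction = inside[lab] / total[lab]
        if low <= fraction <= high:
            mixed.append(lab)
    return mixed
```

A class that lies almost entirely inside or almost entirely
outside the mapped area is not reported; only classes split
between the two are flagged as mixed.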

3. The MALP.CLASS4 Classification Results

Figure 22. Mixed Classes in the MALP.KMEANS2 Classified
Image

Figures 23 and 24 show the five woodland/scrub and
the six mixed classes of the MALP.CLASS4 classification,
respectively. Again, the central woodland area and much of
the southern woodlands were in the woodland/scrub classes,
along with some of the southern scrub areas. Though a small
amount of the northwestern scrub was in these classes, most
of the scrub in that area shows in Figure 24, where it is
mixed with unknown parts of the site.

Figure 23. Woodland/Scrub in the MALP.CLASS4 Classified
Image

Figure 24. Mixed Classes in the MALP.CLASS4 Classified Image

The spectral differences within the woodland and
scrub classes in this site may be due to differences in the
value of the cover and concealment afforded by the different
spectral classes. The classes shown in Figure 23, for
example, could provide good cover and concealment, while the
classes in Figure 24 could be useless as cover or
concealment. The differences could also be due to some other
reason. The difficulty in separating woodland from scrub
and scrub from the rest of the site indicates that better
ground information is necessary before any definite
conclusions can be drawn about the ability of Landsat TM
imagery to identify suitable vegetated areas of cover and
concealment.

V. CONCLUSIONS


A. EVALUATION OF THE ALGORITHMS USED 

Of the classification algorithms examined in this study,
only the HINDU algorithm produced a highly inaccurate
result. After post-processing with the MINDIST algorithm, the
HINDU classification results were comparable to the results
of the other classification algorithms.

Both the KMEANS and the ISOCLASS algorithms found more
spectral classes in the mangrove area than did the HINDU
algorithm. They also found what appear to be transition
regions between information classes. The "land boundary"
transition class was useful for identifying streams. If
these additional spectral classes provide more or better
information about terrain conditions, then KMEANS and
ISOCLASS would be superior to HINDU. If not, the
HINDU-MINDIST combination would clearly be superior because
of its much faster processing time.

Post-processing the results of the ISOCLASS algorithm 
with the MINDIST algorithm was shown not to be worthwhile. 

B. DETECTION OF POTENTIAL WATER OBSTACLES 

All of the algorithm and band combinations examined were
able to detect the three potential water obstacles
identified on the map of the CORINTO site. Most of the band
and algorithm combinations used detected more than one
mangrove class and a transition or boundary class between
water and mangrove and between mangrove and the rest of the
site. Because of the limited ground information available,
it is not clear if these additional spectral classes
correspond to terrain of significantly different obstacle
value.

All of the band subsets and band transformations
detected the potential water obstacles, so a reduction in
the dimensionality of the classification problem is possible.
Portions of the water obstacles were classified in the water
boundary class and, in a few classifications, in the water
class. It is likely that there was enough open water in
these areas to affect the sensor readings.

Most of the classifications also detected portions of
the streams in the study site, with the (4 5 7) band
combination performing best at stream detection. Since most
of the streams (other than the water obstacles) were
classified in the mangrove or the land boundary classes, it
is likely that this classification was due to the effect of
the stream's water on the sensor response.

The minimum value of the transformed divergence does not
appear to be a valid criterion for selecting band sets here.
Since most of the spectral classes in the classified images
were not used, separation of the hardest-to-separate
spectral class may be of no consequence to the analysis. Two
classified images with low minimum values of transformed
divergence (CORPCA.KMEANS3 and CORTC.KMEANS1) were examined.
As shown in Table 14, the minimum value of the transformed
divergence between "spectral classes of interest" in these
two images was greater than it was for many of the other
classified images examined in this study.

C. DETECTION OF VEGETATION PROVIDING COVER AND CONCEALMENT 

Some portions of the MALP site were spectrally separable 
as belonging to the vegetation classes identified from the 
map of the site (see Figures 5 and 6). However, much of the 
two information classes of woodland and scrub belonged to 
mixed spectral classes. These mixed spectral classes also 
included large areas outside of the woodland and scrub 
boundaries, according to the map reference. 

These mixed spectral classes could be the result of
different species or mixes of species of vegetation in the
different parts of the site. They could also be the result
of inaccuracy in the reference map, effects of the dry
season on different species or parts of the site, or other
causes.

Given the above possibilities, it is apparent that the
information classes, woodland and scrub, are very broad.
The likelihood of a homogeneous woodland or scrub class,
even over the small area of the site (14.6 x 14.6 km), is
not large. This is especially true when human activity is
present and when it is related to the cover and concealment
value of the vegetation in the area. Better ground
information is necessary before any firm conclusions can be
made about evaluating vegetative cover and concealment with
Landsat TM imagery.

Most of the classified images with eight clusters had
significantly greater minimum values of transformed
divergence than the corresponding classified image with more
(normally 23) clusters. However, the images with more
clusters appear to better separate the classes of interest.
This is probably because the classes of interest are not
spectrally well-separated, so it is not appropriate to use
the transformed divergence measure under these circumstances
(i.e., to rank unsupervised classification results).

D. SUMMARY 

1. Primary Research Question 

The primary research question examined in this study
was: can unsupervised pattern recognition algorithms be
effectively used on Landsat thematic mapper imagery to
perform parts of the terrain analysis step of the
Intelligence Preparation of the Battlefield process?
Specifically, the study focused on water obstacles and cover
and concealment provided by vegetation.

It appears that some aspects of terrain analysis can
be performed using the methods examined in this study.
Though further research is needed to validate and extend the
results, these methods may make possible rapid, current,
large area terrain analysis, at least for certain terrain
features.

2. CORINTO Site Summary of Results

All of the unsupervised pattern recognition
algorithms and all of the band combinations examined were
able to detect all three of the potential water obstacles
identified from the map of the CORINTO site, so it appears
that water obstacles can indeed be detected using these
methods. The HINDU algorithm followed by the MINDIST
algorithm provided the fastest acceptable classification of
the CORINTO site, where "acceptable" is a qualitative
judgment based on comparing the classified image with the
map overlay of Figure 7. This assumes that there is no
significant obstacle information added by the additional
spectral classes in the other classified images, an
assumption which could be wrong. Better ground reference
information is necessary to determine if these additional
spectral classes add information about obstacles, and thus
to determine which is the "best" classification algorithm.

The simplest method of data reduction examined, band
subsets, provided acceptable classification results based on
a manual comparison with the map overlay of Figure 7.
Therefore, it is probably not necessary to use any of the
other more complicated and time-consuming methods of
reduction of dimensionality examined in this study to detect
water obstacles. Band subsets, using the (4 5 7) band
combination, also provided the best classification of the
streams in the site.

3. HALF Site Summary of Results 

All of the band combinations examined had similar, 
mixed results. Much of the two information classes of 
interest, woodland and scrub, belonged to mixed spectral 
classes. Since the cover and concealment value of the 
vegetation in these two information classes was not known, 
better ground information is necessary before any firm 
conclusions can be made about evaluating vegetative cover 
and concealment with Landsat TM imagery. 

4. Evaluation of Statistical Separability 

The minimum value of the transformed divergence does 
not appear to be a valid criterion for selecting band sets 
here. Since most of the spectral classes in the classified 
images were not used, separation of the hardest-to-separate 
spectral class in an image is of no consequence to the 
analysis. 

Most of the spectral classes of interest did have 
high values of transformed divergence, so the transformed 
divergence may be of some value in deciding which spectral 
classes add unique information to the classification 
results. 

E. DIRECTIONS FOR FURTHER RESEARCH 

The greatest limitation on this study was the limited 
availability of ground reference information. Further study 
with better ground information is necessary to validate or 
refute these results, as well as to discover the reason for 
the mixed results of the attempt to separate vegetative 
cover and concealment. Also, with better ground information, 
a better evaluation of the clustering algorithms and 
band combinations would be possible. 

Ideally, one would like to discover and catalog 
characteristic spectral response patterns for features of 
interest (e.g., water obstacles) for use with a supervised 
classification algorithm such as MINDIST. That would 
eliminate the requirement that the 'water obstacle' class be 
large enough to be treated as a separate class by an 
unsupervised classification algorithm. Supervised 
classification algorithms tend to be faster than 
unsupervised ones, and the postprocessing requirement for 
assigning spectral classes to information classes is reduced 
or eliminated. More study is necessary to determine if such 
a spectral response pattern can be identified and applied 
with an acceptable classification rate. 

Militarily 'interesting' areas are often those areas 
that are different from other areas. For example, military 
excavations or camouflaged areas will not normally cover a 
significant portion of a scene, but these areas would be of 
great military interest. Since these areas may be spectrally 
different from the remainder of a scene, they may be 
'outlier' pixels to the normal scene clusters. Using a 
reverse of the MINDIST algorithm's option to eliminate 
pixels far from cluster means, one can search for these 
outlier pixels. This has the potential to reduce the time 
required for an image analyst to identify areas of enemy 
activity, especially if it is coupled with some method of 
change detection. It could also be useful in identifying 
features such as water obstacles in scenes where there is no 
large, similar class present in the image. 
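
The outlier search suggested above can be illustrated with a 
minimal sketch. It assumes cluster means are already available 
from a prior classification and uses a plain Euclidean distance; 
the function names, the sample values, and the threshold are 
hypothetical and are not part of LAS. 

```python
import math


def nearest_cluster_distance(pixel, means):
    """Return (index, distance) of the cluster mean closest to the pixel."""
    best_index, best_distance = -1, math.inf
    for index, mean in enumerate(means):
        distance = math.dist(pixel, mean)  # Euclidean distance over all bands
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index, best_distance


def find_outliers(pixels, means, max_dist):
    """Return indices of pixels farther than max_dist from every cluster mean.

    This keeps exactly the pixels that a maximum-distance option
    would otherwise discard as unclassified.
    """
    return [
        n for n, pixel in enumerate(pixels)
        if nearest_cluster_distance(pixel, means)[1] > max_dist
    ]


# Hypothetical 3-band pixel values and two cluster means (illustrative only).
means = [(20.0, 30.0, 10.0), (90.0, 80.0, 60.0)]
pixels = [(21.0, 29.0, 11.0), (88.0, 82.0, 61.0), (200.0, 10.0, 150.0)]
outliers = find_outliers(pixels, means, max_dist=25.0)
print(outliers)
```

In practice the threshold would have to be chosen from the 
spread of the scene clusters, which is one of the questions 
left for further study. 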








APPENDIX A - ORIGINAL BAND IMAGES 



Figure 25. CORINTO Site, Thematic Mapper Band 1 Image 
Figure 27. CORINTO Site, Thematic Mapper Band 3 Image 
Figure 29. CORINTO Site, Thematic Mapper Band 5 Image 
Figure 30. CORINTO Site, Thematic Mapper Band 6 Image 
Figure 31. CORINTO Site, Thematic Mapper Band 7 Image 
Figure 32. HALF Site, Thematic Mapper Band 1 Image 
Figure 33. HALF Site, Thematic Mapper Band 2 Image 
Figure 38. HALF Site, Thematic Mapper Band 7 Image 

APPENDIX B - TRANSFORMED BAND IMAGES 

Figure 43. HALF Site, Principal Component 2 Image 
Figure 44. HALF Site, Principal Component 3 Image 
Figure 46. CORINTO Site, Tasseled Cap Brightness Image 













Figure 48. HALF Site, Tasseled Cap Greenness Image 
Figure 52. CORINTO Site, (Band 2)/(Band 5) Ratio Image 

APPENDIX C - SHORT PROGRAMS 

A. BAND RATIO PROGRAM 


c ratio.for
c
c purpose:
c this program performs the band ratioing operation.
c It operates by calculating the pixel-by-pixel
c division of one 512 x 512 image file by another.
c The program consists of a main program and one
c subroutine.  The subroutine readimg reads an image
c file.
c
c************************* input ***********************
c
c this program assumes that both image files are 512 x 512
c pixels in size, stored as BYTE (or INTEGER*1) data.
c
c the user interactively specifies both input file names
c and the output file name.
c
c************************* output **********************
c
c the output image is stored in the user specified file as
c BYTE (or INTEGER*1) data.
c

c********************** main program *******************

c define and dimension variables

      byte ioimg(512,512)
      integer i,j,k,l,intimg
      real*4 numimg(512,512), denimg(512,512), scale
      character*20 imgname1, imgname2, imgname3

      parameter (scale = 162.3)

c get filenames from user

      print *, ' enter the numerator filename '
      read '(a)', imgname1

      print *, ' enter the denominator filename '
      read '(a)', imgname2

      print *, ' enter the output image filename '
      read '(a)', imgname3

c read numerator image file

      call readimg(imgname1,numimg)

c read denominator image file

      call readimg(imgname2,denimg)

c divide numerator image by denominator image
c and scale to 0-255 (real).  atan2 maps the ratio of two
c non-negative values onto 0 to pi/2, and scale = 162.3
c stretches that range to about 0-255.

      do 10 i=1,512
         do 11 j=1,512
            numimg(i,j)=scale*atan2(numimg(i,j),denimg(i,j))
 11      continue
 10   continue

c scale the image to 0-255 (byte).  BYTE is signed, so
c values 128-255 are stored as value - 256.

      do 85 i=1,512
         do 86 j=1,512
            intimg = int(numimg(i,j))
            if (intimg.ge.128) then
               ioimg(i,j) = intimg - 256
            else
               ioimg(i,j) = intimg
            end if
 86      continue
 85   continue

c write output image file, one image row per record

      open(unit=1,name=imgname3,type='new',access='direct',
     *recordsize=128,maxrec=512)
      do 100 i=1,512
         write(1'i) (ioimg(i,j), j=1,512)
 100  continue

      close(unit=1)

      end

c subroutine: readimg
c
c purpose:
c subroutine to read byte input image and convert to
c a real image

      subroutine readimg(name,image)

      byte ioimg(512,512)
      integer i,j
      real*4 image(512,512)
      character*20 name

      open(unit=1,name=name,type='old',access='direct',
     *recordsize=128,maxrec=512)

      do 10 i=1,512
         read(1'i) (ioimg(i,j), j=1,512)
 10   continue

      close(unit=1)

c zero-extend each signed byte to 0-255 before converting
c it to real

      do 20 i=1,512
         do 30 j=1,512
            image(i,j)=float(jzext(ioimg(i,j)))
 30      continue
 20   continue

      return

      end
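
The arctangent scaling used by the band ratio program can be 
checked on a few sample values with the following sketch. It is 
written in Python rather than Fortran purely for illustration; 
the function names and the 2 x 2 test images are hypothetical, 
and only the constant 162.3 and the signed-byte rule are taken 
from the listing above. 

```python
import math

SCALE = 162.3  # 162.3 * (pi / 2) is about 254.9, so results stay within 0-255


def ratio_pixel(numerator, denominator):
    """Scaled arctangent of numerator/denominator, truncated to an integer.

    atan2 avoids an explicit division, so a zero denominator is safe.
    Python's math.atan2(0, 0) returns 0.0; other systems may differ.
    """
    return int(SCALE * math.atan2(numerator, denominator))


def to_signed_byte(value):
    """Store 0-255 in a signed byte: 128-255 become value - 256."""
    return value - 256 if value >= 128 else value


def ratio_image(numerator_image, denominator_image):
    """Apply ratio_pixel to every pair of corresponding pixels."""
    return [
        [ratio_pixel(n, d) for n, d in zip(num_row, den_row)]
        for num_row, den_row in zip(numerator_image, denominator_image)
    ]


# Hypothetical 2 x 2 images: equal bands, zero numerator, zero denominator.
band_a = [[50.0, 0.0], [100.0, 0.0]]
band_b = [[50.0, 100.0], [0.0, 0.0]]
print(ratio_image(band_a, band_b))
```

Equal band values land near the middle of the output range, 
which is the practical advantage of the arctangent over a 
plain quotient: ratios below and above one receive comparable 
shares of the 0-255 scale. 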


B. TASSELED CAP TRANSFORMATION PROGRAM 

c tasseledcap.for
c
c purpose:
c this program performs the tasseled cap
c transformation on six input bands and produces the
c first three tasseled cap component images:
c greenness, brightness, and wetness.
c It operates by calculating the transformations one
c at a time.
c The program consists of a main program and two
c subroutines.  The subroutine readimg reads an image
c file and the subroutine scale scales the output to
c the required 0-255 range and writes the output image
c to disk.
c
c************************* input ***********************
c
c this program assumes that all image files are 512 x 512
c pixels in size, stored as BYTE (or INTEGER*1) data.
c
c the user interactively specifies all six of the input
c file names and all three of the output file names.
c
c************************* output **********************
c
c the output images are stored in the user specified files
c as BYTE (or INTEGER*1) data.
c

c********************** main program *******************

c define and dimension variables

      integer i,j,k
      integer mindsp,maxdsp
      real*4 image(512,512), tc(512,512)
      character*20 band(6)
      character*20 green,bright,wet

c tasseled cap coefficients for TM bands 1, 2, 3, 4, 5, 7

      real*4 g(6)/-.2848,-.2435,-.5436,.7243,.0840,-.1800/
      real*4 b(6)/.3037,.2793,.4743,.5585,.5082,.1863/
      real*4 w(6)/.1509,.1973,.3279,.3406,-.7112,-.4572/

      parameter (mindsp=0,maxdsp=255)

c get filenames from user.  band(6) holds TM band 7; the
c thermal band 6 is not used.

      print *, ' enter band 1 filename '
      read '(a)', band(1)

      print *, ' enter band 2 filename '
      read '(a)', band(2)

      print *, ' enter band 3 filename '
      read '(a)', band(3)

      print *, ' enter band 4 filename '
      read '(a)', band(4)

      print *, ' enter band 5 filename '
      read '(a)', band(5)

      print *, ' enter band 7 filename '
      read '(a)', band(6)

      print *, ' enter greenness (output) filename '
      read '(a)', green

      print *, ' enter brightness (output) filename '
      read '(a)', bright

      print *, ' enter wetness (output) filename '
      read '(a)', wet

c calculate and output the greenness transformation

      call readimg(band(1),tc)

      do 12 i=1,512
         do 13 j=1,512
            tc(i,j) = g(1)*tc(i,j)
 13      continue
 12   continue

      do 15 k=2,6
         call readimg(band(k),image)
         do 10 i=1,512
            do 11 j=1,512
               tc(i,j)=tc(i,j) + g(k)*image(i,j)
 11         continue
 10      continue
 15   continue

c scale the image to 0-255 and write to disk

      call scale(green,tc,mindsp,maxdsp)

c calculate and output the brightness transformation

      call readimg(band(1),tc)

      do 22 i=1,512
         do 23 j=1,512
            tc(i,j) = b(1)*tc(i,j)
 23      continue
 22   continue

      do 25 k=2,6
         call readimg(band(k),image)
         do 20 i=1,512
            do 21 j=1,512
               tc(i,j)=tc(i,j) + b(k)*image(i,j)
 21         continue
 20      continue
 25   continue

c scale the image to 0-255 and write to disk

      call scale(bright,tc,mindsp,maxdsp)

c calculate and output the wetness transformation

      call readimg(band(1),tc)

      do 32 i=1,512
         do 33 j=1,512
            tc(i,j) = w(1)*tc(i,j)
 33      continue
 32   continue

      do 35 k=2,6
         call readimg(band(k),image)
         do 30 i=1,512
            do 31 j=1,512
               tc(i,j)=tc(i,j) + w(k)*image(i,j)
 31         continue
 30      continue
 35   continue

c scale the image to 0-255 and write to disk

      call scale(wet,tc,mindsp,maxdsp)

      end

c 

c 

c subroutine: scale
c
c purpose:
c subroutine to scale image to 0-255 and write it to disk

      subroutine scale(imgname,numimg,mindsp,maxdsp)

      integer i,j
      integer mindsp,maxdsp
      byte ioimg(512,512)
      real*4 numimg(512,512),minmag,maxmag
      character*20 imgname

c find the minimum and maximum pixel values.  the two tests
c are independent so that one pixel can update both
c extremes, and maxmag starts below any expected value so
c that all-negative images are handled.

      minmag = 1.0e10
      maxmag = -1.0e10
      do 80 i=1,512
         do 81 j=1,512
            if (numimg(i,j).lt.minmag) minmag = numimg(i,j)
            if (numimg(i,j).gt.maxmag) maxmag = numimg(i,j)
 81      continue
 80   continue

c stretch to 0-maxdsp, then store as signed bytes
c (values 128-255 are stored as value - 256)

      do 85 i=1,512
         do 86 j=1,512
            numimg(i,j)=(numimg(i,j)-minmag)*(maxdsp/(maxmag-minmag))
            if (numimg(i,j).gt.127) then
               ioimg(i,j) = numimg(i,j) - 256
            else
               ioimg(i,j) = numimg(i,j)
            endif
 86      continue
 85   continue

c write output image file

      open(unit=1,name=imgname,type='new',access='direct',
     *recordsize=128,maxrec=512)
      do 100 i=1,512
         write(1'i) (ioimg(i,j), j=1,512)
 100  continue

      close(unit=1)

      return

      end

c subroutine: readimg
c
c purpose:
c subroutine to read byte input image and convert to
c a real image

      subroutine readimg(name,image)

      byte ioimg(512,512)
      integer i,j
      real*4 image(512,512)
      character*20 name

      open(unit=1,name=name,type='old',access='direct',
     *recordsize=128,maxrec=512)

      do 10 i=1,512
         read(1'i) (ioimg(i,j), j=1,512)
 10   continue

      close(unit=1)

c zero-extend each signed byte to 0-255 before converting
c it to real

      do 20 i=1,512
         do 30 j=1,512
            image(i,j)=float(jzext(ioimg(i,j)))
 30      continue
 20   continue

      return

      end
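
The arithmetic of the tasseled cap program reduces to a 
weighted sum per pixel followed by a linear stretch. The sketch 
below restates it in Python for a handful of values; the 
coefficients are those of the listing above, while the function 
names and sample pixels are hypothetical. The guard for a 
constant image is an added assumption that the Fortran listing 
does not contain. 

```python
# Coefficients for TM bands 1, 2, 3, 4, 5, 7, as in the Fortran listing.
GREENNESS = (-0.2848, -0.2435, -0.5436, 0.7243, 0.0840, -0.1800)
BRIGHTNESS = (0.3037, 0.2793, 0.4743, 0.5585, 0.5082, 0.1863)
WETNESS = (0.1509, 0.1973, 0.3279, 0.3406, -0.7112, -0.4572)


def tasseled_cap(pixel, coefficients):
    """Weighted sum of the six band values of one pixel."""
    return sum(c * v for c, v in zip(coefficients, pixel))


def rescale(values, max_display=255):
    """Linearly stretch values to 0..max_display, truncating to integers."""
    low, high = min(values), max(values)
    if high == low:
        # Assumption: a constant image maps to 0 instead of dividing by zero.
        return [0 for _ in values]
    return [int((v - low) * max_display / (high - low)) for v in values]


# Hypothetical pixels: a uniform grey target and a vegetation-like target
# with a strong band 4 (near infrared) response.
grey = (100, 100, 100, 100, 100, 100)
vegetation = (30, 30, 20, 120, 60, 20)
print(tasseled_cap(grey, BRIGHTNESS))
print(tasseled_cap(vegetation, GREENNESS) > tasseled_cap(grey, GREENNESS))
print(rescale([2.0, 4.0, 6.0]))
```

Because the stretch uses the minimum and maximum of each 
component image, output values are comparable within one image 
but not between images or between scenes. 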
APPENDIX D - THE LAND ANALYSIS SYSTEM 


A. OVERVIEW 

The Land Analysis System (LAS) is an image analysis 
system designed for use with satellite imagery. [Ref. 20:p. 
1] It provides the capability to manipulate and analyze 
digital image data and includes a wide range of functions 
and statistical tools for image analysis. In addition to 
routines for extracting the study sites from a Landsat 
scene, image statistics calculation, and file management 
functions, LAS includes a variety of routines for both 
supervised and unsupervised classification. All three of 
the unsupervised classification routines (HINDU, KMEANS, and 
ISOCLASS) and one of the supervised classification routines 
(MINDIST) are described below. 

B. THE HINDU CLASSIFICATION ROUTINE 

HINDU classifies a multiband image based upon its 
multidimensional histogram. [Ref. 20:pp. HINDU-1 to HINDU-3] 
Regions in the histogram with high density are regarded as 
pattern clusters. 

1. User Input 

The user specifies the input image, the minimum and 
maximum acceptable number of clusters, and the number of 
gray levels per histogram bin [Ref. 20:p. HINDU-2]. 

2. Algorithm Description 

Each bin or cell of the multidimensional histogram 
is examined for neighbors that have a higher density. [Ref. 
20:p. HINDU-2] The low density cells are then assigned in 
proportion to their high-density neighbors. This 
reassignment is carried out from the lowest to the highest 
density cell, recalculating the density at each stage. The 
histogram is then searched for entries of greater than 
average density. These entries are considered as possible 
clusters. If there are too few clusters, the program aborts. 
If there are too many clusters, those with a lower 
significance are deleted to obtain a number of clusters in 
the specified range. Each pixel is assigned to the nearest 
cluster. 

HINDU is suitable primarily for Landsat images. 
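
The idea of treating dense histogram regions as clusters can be 
illustrated with a deliberately simplified, single-band sketch. 
It is not the HINDU implementation: it omits the reassignment 
of low-density cells and the significance-based pruning, and 
the function names, bin width, and sample values are 
hypothetical. 

```python
def histogram_peaks(values, bin_width):
    """Return centers of bins that are local maxima above the average count.

    Only occupied bins enter the average. A bin must exceed its left
    neighbor and at least equal its right neighbor, which breaks ties
    between adjacent bins of equal density.
    """
    counts = {}
    for v in values:
        b = int(v // bin_width)
        counts[b] = counts.get(b, 0) + 1
    average = sum(counts.values()) / len(counts)
    peaks = []
    for b, c in sorted(counts.items()):
        left, right = counts.get(b - 1, 0), counts.get(b + 1, 0)
        if c > average and c > left and c >= right:
            peaks.append((b + 0.5) * bin_width)
    return peaks


def assign_to_peaks(values, peaks):
    """Label each value with the index of the nearest peak."""
    return [min(range(len(peaks)), key=lambda k: abs(v - peaks[k])) for v in values]


# Hypothetical gray levels: a dark group, one stray value, a bright group.
gray_levels = [10, 11, 12, 12, 13, 50, 100, 101, 102, 103]
peaks = histogram_peaks(gray_levels, bin_width=10)
labels = assign_to_peaks(gray_levels, peaks)
print(peaks, labels)
```

The stray value forms no cluster of its own because its bin is 
below average density; it is simply absorbed by the nearest 
peak, which mirrors the last step of the description above. 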

C. KMEANS 

KMEANS performs an unsupervised classification using the 
K-means algorithm. [Ref. 20:p. KMEANS-1] Input images can 
have up to 24 bands and the algorithm can produce classified 
images with up to 64 clusters. 

1. User Input 

The user specifies the input and output image names 
and the following parameters [Ref. 20:p. KMEANS-1]: 

• NCLUST - number of clusters desired. 

• MAXIT - maximum number of iterations. 

• PCTCNG - threshold value of the percentage of pixels 
changing cluster assignment in an iteration. If the 
percentage of pixels changing cluster assignments 
between iterations falls below this value, clustering has 
converged and execution is terminated. 


2. Algorithm Description 

KMEANS operates as follows [Ref. 20:pp. KMEANS-2 to 
KMEANS-3]: 


• Step 1: Compute the image means and standard 
deviations. 

• Step 2: Determine the location of initial cluster 
centers. 

• Step 3: Assign data to clusters using the minimum 
Euclidean distance rule. 

• Step 4: Update cluster centers using the assignments of 
step 3. 

• Step 5: Stop if MAXIT is exceeded or if the percentage 
of pixels changing clusters was less than PCTCNG. 
Otherwise, go to step 3. 

• Step 6: Compute and print statistics. 


The location of initial cluster centers is given by 

\[
\mathrm{Center}(\text{cluster } k, \text{band } i) = m_i - \sigma_i + \frac{2\sigma_i\,(k-1)}{\mathrm{NCLUST}-1} \tag{1}
\]

where $m_i$ is the mean value of band i, $\sigma_i$ is the standard 
deviation of band i, and k ranges from 1 to NCLUST. 
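
Steps 1 through 5 can be sketched in a few lines of Python. 
This is an illustration under stated assumptions, not the LAS 
code: the function names are hypothetical, the parameters only 
mirror NCLUST, MAXIT, and PCTCNG, and the initial centers are 
assumed to be spread evenly from one standard deviation below 
to one standard deviation above each band mean. 

```python
import math


def initial_centers(pixels, nclust):
    """Spread nclust centers from mean - std to mean + std in every band.

    Assumed initialization rule; requires nclust >= 2.
    """
    nbands, count = len(pixels[0]), len(pixels)
    means = [sum(p[i] for p in pixels) / count for i in range(nbands)]
    stds = [
        math.sqrt(sum((p[i] - means[i]) ** 2 for p in pixels) / count)
        for i in range(nbands)
    ]
    return [
        tuple(means[i] - stds[i] + 2.0 * stds[i] * k / (nclust - 1)
              for i in range(nbands))
        for k in range(nclust)
    ]


def kmeans(pixels, nclust, maxit=20, pctcng=1.0):
    """Return (labels, centers) after at most maxit iterations."""
    centers = initial_centers(pixels, nclust)
    labels = [-1] * len(pixels)
    for _ in range(maxit):
        # Step 3: assign each pixel to the nearest center (Euclidean rule).
        new_labels = [
            min(range(nclust), key=lambda k: math.dist(p, centers[k]))
            for p in pixels
        ]
        changed = sum(1 for old, new in zip(labels, new_labels) if old != new)
        labels = new_labels
        # Step 4: move each center to the mean of its members.
        for k in range(nclust):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                centers[k] = tuple(
                    sum(m[i] for m in members) / len(members)
                    for i in range(len(members[0]))
                )
        # Step 5: stop once fewer than pctcng percent of pixels changed.
        if 100.0 * changed / len(pixels) < pctcng:
            break
    return labels, centers


# Hypothetical two-band pixels forming a dark and a bright group.
sample = [(10, 10), (12, 11), (11, 12), (90, 95), (92, 91), (91, 93)]
labels, centers = kmeans(sample, nclust=2)
print(labels, centers)
```

Empty clusters keep their previous center in this sketch; how 
KMEANS itself treats them is not stated in the description 
above. 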

D. ISOCLASS 


ISOCLASS performs unsupervised classification of a 
multispectral image using an isodata-type clustering 
algorithm. [Ref. 20:p. ISOCLASS-1] Input images can have up 
to 24 bands and the algorithm can produce classified images 
with up to 64 clusters. ISOCLASS has the capability of 
continuing a classification by reading an input statistics 
file from a previous execution. 

1. User Input 

The user specifies the input and output image names, 
the output statistics file name, and the following 
parameters [Ref. 20:p. ISOCLASS-1 to ISOCLASS-2]: 

• MAXIT - Maximum number of iterations. 

• DLMIN - Two clusters whose means are closer than DLMIN 
are combined. 

• NMIN - Minimum number of members desired in any cluster. 
Clusters that have less than NMIN members are deleted. 

• STDMAX - Any cluster whose standard deviation is greater 
than STDMAX and whose number of members is greater than 
2(NMIN + 1) is split. 

• MAXCLS - Maximum number of clusters. 

• CHNTHS - Threshold for chaining clusters. 

2. Algorithm Description 

ISOCLASS operates as follows [Ref. 20:pp. ISOCLASS-2 
to ISOCLASS-3]: 

• Step 1: ISOCLASS reads the initial cluster centroids 
from the statistics file, or assumes that all of the 
data are a single cluster and computes the mean and 
standard deviation vectors. The mean vector is split 
(see below). 

• Step 2: Data are assigned to clusters using the minimum 
city block distance rule. 

• Step 3: Cluster means and standard deviations are 
computed. 

• Step 4: If MAXIT has been reached, go to step 9. 

• Step 5: All clusters with fewer than NMIN members are 
deleted. 

• Step 6: The type of iteration, split or combine, is 
determined (see below). 

• Step 7: Cluster centroids are split or combined 
(depending on the type of iteration). 

• Step 8: Go to step 2. 

• Step 9: Statistics are computed and a summary is 
printed. The statistics are stored in the output statistics 
file. 

• Step 10: The image is chained (see below). 


a. Splitting Clusters 

A cluster is split in the jth band if the cluster's 
maximum standard deviation is in the jth band, the standard 
deviation in the jth band is greater than STDMAX, and the 
cluster has more than 2(NMIN + 1) members. [Ref. 20:pp. 
ISOCLASS-3 to ISOCLASS-4] On a given iteration, all clusters 
that meet the criteria are split, as long as the maximum 
number of clusters has not been reached. Once the maximum 
number of clusters has been reached, classification 
continues without the creation of new clusters. 
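
The split test can be written compactly, as in the sketch 
below. The test itself follows the three conditions stated 
above. The way the two new centers are placed (one standard 
deviation on either side of the old mean in the chosen band) 
is an assumption borrowed from common isodata practice and is 
not taken from the LAS documentation; all names are 
hypothetical. 

```python
def band_to_split(stds, members, stdmax, nmin):
    """Return the index of the band to split on, or None for no split.

    stds    -- per-band standard deviations of the cluster
    members -- number of pixels currently in the cluster
    """
    band = max(range(len(stds)), key=lambda i: stds[i])
    if stds[band] > stdmax and members > 2 * (nmin + 1):
        return band
    return None


def split_cluster(mean, stds, band):
    """Assumed placement rule: offset the mean by +/- one std in `band`."""
    low, high = list(mean), list(mean)
    low[band] -= stds[band]
    high[band] += stds[band]
    return tuple(low), tuple(high)


# Hypothetical three-band cluster whose spread is largest in the second band.
mean = (10.0, 40.0, 20.0)
stds = [2.0, 9.0, 4.0]
band = band_to_split(stds, members=50, stdmax=5.0, nmin=10)
if band is not None:
    print(split_cluster(mean, stds, band))
```

The membership condition keeps each half of a split cluster 
above NMIN, so that a newly created cluster is not deleted on 
the next iteration. 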

b. Determining the Type of Iteration 

ISOCLASS begins with a sequence of split operations. 
[Ref. 20:p. ISOCLASS-5] This sequence ends when at least 80 
percent of the clusters have standard deviations less than 
STDMAX. At that point, the operations alternate between 
combine and split operations until the last iteration, which 
is always a split operation. The initial sequence of split 
operations serves to initialize the cluster centers. The 
sequence of initial split operations is shortened 
considerably if the initial cluster centers are provided in 
an input statistics file. 

c. Chaining Clusters 

The last step is to chain all clusters with 
intercluster distances less than CHNTHS. [Ref. 20:p. 
ISOCLASS-5] The chaining procedure was adopted because the 
minimum variance procedure used in ISOCLASS tends to form 
ellipsoidal clusters with Gaussian distributions. While the 
Gaussian distribution is natural and is normally 
satisfactory, there could also be natural groupings of data 
that are oddly shaped and cannot be approximated by a 
Gaussian distribution. 

The statistics of the chained clusters are not 
calculated because the chained cluster cannot be represented 
by a Gaussian distribution. [Ref. 20:p. ISOCLASS-5] The 
chained clusters are not combined in the classified image. 
Instead, a message is printed in the classification summary 
to indicate that clusters meet the chaining criteria. 

E. MINDIST 

MINDIST performs a supervised classification of a 
multiband image based on minimum distance from class means. 
[Ref. 20:p. MINDIST-1] It has an option for specifying the 
maximum distance a pixel can be from the nearest cluster's 
center and still be assigned to that cluster (this can be 
used for classifying pixels that are too far from any 
cluster as "unknown"). 

MINDIST can be used to improve on the results of the 
unsupervised clustering algorithms, either by discarding 
pixels that are too far from cluster centroids or by 
reclassifying the image using a different distance rule. 

1. User Input 

The user specifies the input and output image names, 
the output statistics file name, and the following 
parameters [Ref. 20:p. MINDIST-1 to MINDIST-2]: 

• MAXDIST: The maximum distance a pixel can be from the 
nearest cluster centroid and still be assigned to that 
cluster. The options are for a pixel to be assigned to 
the nearest cluster no matter how far away, for the user 
to supply a value for each class, and for a single 
maximum distance for all classes. If the pixel is 
farther than MAXDIST from the nearest cluster 
centroid, it is assigned a value of 0, signifying an 
unclassified pixel. 

• WEIGHTS: Weights to apply to each input image band. 

• METRIC: Specifies the measure used to calculate 
distance from the cluster centers. The CITYBLOCK and the 
EUCLIDEAN distance measures are available. 


2. Algorithm Description 

Each pixel is assigned to a cluster based on the 
selected distance rule. 

The CITYBLOCK distance rule operates as follows 
[Ref. 20:p. MINDIST-2 to MINDIST-3]: 

\[
CD_j = \sum_{i=1}^{n} W_i \, |x_i - m_{ij}| \tag{2}
\]

where $CD_j$ is the "city block" distance between pixel x and 
the mean of cluster j, n is the number of bands in the 
image, $x_i$ is the value of pixel x in band i, $m_{ij}$ is the 
mean in band i of class j, and $W_i$ is the weight assigned to 
band i. 

The EUCLIDEAN distance rule operates as follows 
[Ref. 20:p. MINDIST-2 to MINDIST-3]: 

\[
ED_j = \sqrt{\sum_{i=1}^{n} W_i \, (x_i - m_{ij})^2} \tag{3}
\]

where $ED_j$ is the Euclidean distance between pixel x and the 
mean of cluster j, n is the number of bands in the image, 
$x_i$ is the value of pixel x in band i, $m_{ij}$ is the mean in 
band i of class j, and $W_i$ is the weight assigned to band i. 

The output pixel is assigned to the class j which 
has the minimum distance as calculated using the chosen 
distance rule. [Ref. 20:p. MINDIST-3] If MAXDIST was 
selected to classify pixels as "unknown" and the minimum 
distance is greater than MAXDIST, then the output pixel is 
assigned a value of 0. 
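
A minimal sketch of the two distance rules and the MAXDIST 
option follows. It is an illustration, not the LAS code: the 
function names and sample values are hypothetical, a single 
maximum distance is applied to all classes, and the weighted 
Euclidean form (each weight multiplying the squared band 
difference) is an assumption. 

```python
import math


def city_block(pixel, mean, weights):
    """Weighted sum of absolute band differences."""
    return sum(w * abs(x - m) for w, x, m in zip(weights, pixel, mean))


def euclidean(pixel, mean, weights):
    """Weighted Euclidean distance.

    Assumption: each weight multiplies the squared band difference.
    """
    return math.sqrt(sum(w * (x - m) ** 2 for w, x, m in zip(weights, pixel, mean)))


def classify(pixel, means, weights, metric=euclidean, maxdist=None):
    """Return the 1-based class of the nearest mean, or 0 for "unknown"."""
    distances = [metric(pixel, mean, weights) for mean in means]
    best = min(range(len(means)), key=lambda j: distances[j])
    if maxdist is not None and distances[best] > maxdist:
        return 0
    return best + 1


# Hypothetical class means for a three-band image, with equal band weights.
class_means = [(10, 10, 10), (50, 60, 70)]
band_weights = (1, 1, 1)
print(classify((12, 9, 11), class_means, band_weights))
print(classify((200, 200, 200), class_means, band_weights, maxdist=30.0))
print(classify((48, 61, 72), class_means, band_weights, metric=city_block))
```

Class numbers start at 1 so that 0 remains free for 
unclassified pixels, matching the output convention described 
above. 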